Abstract


Selective enumeration is a method for reducing the number of cases
required when performing a generate-and-test search to solve relational
formulae. This paper gives a formal definition of selective enumeration
and, using that definition, proves soundness for each of the selective
enumeration techniques developed.


1.  Introduction 


Sets,  functions,  and  binary  relations  combine  to  provide  a  convenient,  yet  rigorous,  framework  for 
modeling  software  systems.  Z  [Spi92],  probably  the  most  widely  used  formal  notation  for 
describing  software  systems,  is  based  entirely  on  these  constructs.  As  sets,  functions,  and  relations 
can  all  be  described  using  relational  formulae,  I  use  the  term  relational  specification  to  describe 
any  specification  built  on  these  constructs. 

Other  software  description  notations  also  draw  much  of  their  expressive  power  from  these 
constructs.  Within  the  database  community,  the  inter-relationships  in  a  database  schema  are  often 
specified using an entity-relationship diagram [Che76]. Given the name, it should not be
surprising  that  entity-relationship  diagrams  can  be  clearly  and  succinctly  described  using 
relations. 

More  recently,  UML  [BJR97]  has  gathered  great  interest  in  the  object  community.  Although  UML 
combines  several  different  notations  to  describe  a  single  object  design,  many  of  these  notations 
are  built  from  sets,  functions,  and  binary  relations. 

Despite  the  broad  appeal  of  these  constructs,  little  automated  support  is  available  for  analyzing 
relational  specifications.  Theorem  provers  [ES94;  SM96]  can  help,  but  they  require  enormous 
manual  effort  and  provide  little  guidance  to  help  repair  faulty  specifications.  Model  checkers 
[BC+92;  CPS93]  can  analyze  system  specifications  based  on  other  formalisms,  but  no  model 
checkers  are  available  for  relational  specifications. 

1.1  Generate-and-Test  Searching 

A  method  for  solving  relational  formulae  must  lie  at  the  core  of  any  automated  tool  for  analyzing 
relational  specifications.  The  simplest  approach  is  a  generate-and-test  search.  A  generate-and-test 
search  generates  every  possible  mapping  of  variables  to  values,  called  assignments,  for  a 
particular  formula.  The  search  then  tests  each  generated  assignment  against  that  formula.  The 
result  of  the  search  is  a  set  of  satisfying  assignments,  that  is,  assignments  that  give  a  true 
interpretation  to  the  formula.  A  depth  first  search  can  trivially  generate  a  complete  set  of 
assignments,  with  each  level  of  the  search  tree  corresponding  to  a  distinct  variable  in  the  formula. 
Testing  a  single  assignment  against  a  formula  is  also  straightforward,  requiring  only  an 
implementation  of  the  standard  boolean,  set  and  relational  operations,  making  a  generate-and-test 
search  a  simple  solution. 
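
The search just described can be sketched in a few lines of Python. This is an illustrative sketch only, not Nitpick's implementation: the function name generate_and_test, the dictionary encoding of assignments, and the toy formula (x in s over a two-element universe) are all assumptions made for the example.

```python
from itertools import product

def generate_and_test(domains, formula):
    """Return every assignment over `domains` that satisfies `formula`.

    domains: dict mapping a variable name to its list of candidate values
    formula: function taking an assignment (dict) and returning a bool
    """
    names = list(domains)
    satisfying = []
    # product() visits the same leaves a depth-first search would,
    # with one level of the search tree per variable.
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if formula(assignment):
            satisfying.append(assignment)
    return satisfying

# Toy instance (hypothetical): a scalar x and a set s over {e0, e1}.
universe = ["e0", "e1"]
all_subsets = [frozenset(), frozenset({"e0"}), frozenset({"e1"}), frozenset(universe)]
result = generate_and_test({"x": universe, "s": all_subsets},
                           lambda a: a["x"] in a["s"])
```

Every leaf of the search tree is generated and tested here, which is exactly the cost that the remainder of this paper sets out to reduce.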

However,  using  generate-and-test  search  as  a  solver  presents  two  limitations.  By  its  nature,  a 
generate-and-test  search  will  consider  only  some  finite  subset  of  the  (generally  infinite)  possible 
assignment  space.  Although  this  limitation  prevents  a  generate-and-test  search  from  being  a  true 
verifier  for  infinite  problems,  it  does  not  remove  all  practical  applications.  As  I  believe  that  many, 
if  not  most,  errors  in  specifications  can  be  demonstrated  using  only  a  small  subset  of  the  entire 
assignment space, a generate-and-test search can be the basis of a practical specification analysis
tool. 

The  second  limitation  is  the  time  required  to  generate  and  test  all  the  assignments.  This  limitation 
has  far  more  significant  practical  implications.  Even  with  a  simple  specification  (such  as  finder 
[JD95])  limited  to  only  five  underlying  objects,  the  total  number  of  assignments  required  to 
generate  and  test  exceeds  10^^. 
Generating all 10^^ assignments is clearly infeasible, rendering naive exhaustive enumeration
useless. Fortunately, the vast majority of these assignments are in some sense “duplicates” of
other assignments. One assignment may be a permutation of another assignment. Or two
assignments may share some common partial assignment, which itself determines the
interpretation of the formula. Regardless of the nature of the duplication, generating only one
assignment from each set of duplicate assignments is sufficient.

Selective enumeration is a generate-and-test search method that prevents the generation of most
duplicates. By preventing the generation of these duplicates, selective enumeration is effective in
solving many interesting relational formulae.

1.2  Alloc  —  An  Example 

This  section  introduces  a  very  simple  relational  specification,  which  will  be  used  to  illustrate 
points  throughout  the  remainder  of  this  paper.  This  simple  example  describes  a  heap  allocation 
system,  such  as  malloc,  in  very  general  terms. 

The  specification  is  written  in  NP  [JD96a],  a  relational  specification  language  that  is  roughly  a 
subset  of  Z.  NP  is  limited  to  first-order  objects,  so,  for  example,  there  are  no  functions  of 
functions.  Figure  1  contains  the  NP  specification  for  the  heap  allocation  system. 

The  first  line  of  the  example  introduces  the  two  given  types  used  in  this  specification,  Addr  and 
Value.  A  given  type  is  a  set  of  elements,  with  each  element  having  no  internal  structure.  Every 
element  is  contained  in  exactly  one  given  type.  All  variables  and  expressions  in  NP  are  typed, 
indicating  that  they  refer  to  one  of  three  kinds  of  values.  A  variable  or  expression  may  refer  to  (1) 
an  element  of  a  given  type,  (2)  a  set  of  elements  of  a  single  given  type,  or  (3)  a  relation  that  maps 
elements  of  a  given  type  (the  domain)  to  elements  of  a  given  type  (the  range).  Relations  can  be 
restricted  to  functions,  injections  or  bijections  and  they  can  be  restricted  to  total  or  surjective 
relations. 

When using NP, specifiers describe their system using a collection of schemas, which allow a
simple  structuring  and  composition  of  individual  pieces  of  the  specification,  similar  to  the 
mechanism  provided  by  Z.  There  are  two  independent  characteristics  that  jointly  classify  schemas 
in  NP.  A  schema  is  either  a  definition,  which  defines  the  system  being  specified,  or  a  claim,  which 
makes  assertions  about  the  system  being  specified.  A  schema,  whether  a  definition  or  a  claim, 
refers  either  to  a  single  state  or  to  a  transition  between  two  states.  A  transitional  schema  is  called 
an  operation  and  describes  both  a  pre-state  and  a  post-state.  The  specification  given  in  Figure  1 
contains  examples  of  three  of  the  four  possible  combinations  of  these  characteristics,  as  explained 
in  the  following  paragraphs. 

All  schemas  have  the  same  basic  structure.  The  body  of  the  schema  comes  after  the  name  of  the 
schema  and  is  enclosed  in  square  brackets  ( [  ] ).  The  body  is  separated  into  two  sections  by  a 
single  vertical  bar  ( | ).  The  first  section  defines  the  variables  used  in  the  schema,  whereas  the 
second  section  gives  a  collection  of  relational  formulae  that  must  all  be  satisfied  in  any  system 
described  by  this  specification. 

In the example given in Figure 1, Heap is a definitional schema that describes the basic structure
of a heap. Heap introduces two variables, usage and used. The variable usage denotes a
function mapping addresses (elements of Addr) to their values (elements of Value). The other
variable, used, denotes a set that contains all of the addresses currently in use. Heap also defines
a single formula that describes a relationship that must hold in all valid heaps: the set of addresses
in use is exactly the set of addresses currently mapped, that is, the domain of the function usage.

[Addr, Value]

Heap =
[
    usage : Addr -> Value
    used : set Addr
|
    /* all currently mapped addresses are used */
    used = dom usage
]

Alloc(addr : Addr) =
[
    Heap
|
    /* Allocating a new address does not change the current allocation */
    used <: usage' = usage

    /* But addr is now mapped (to some value unknown) */
    used' = used U {addr}
]

uniqueAddrAlloc::
[
    Heap
    a : Addr
|
    /* A newly allocated address should not have been in use */
    Alloc(a) => a not in used
]

Figure 1: A trivial NP specification describing a heap allocation system. Addr and Value are the
given types. Heap describes the basic structure being manipulated. Alloc describes an allocation
operation, and uniqueAddrAlloc is a claim about the specification.

Alloc  is  an  operation  that  describes  the  change  in  a  heap  when  a  new  piece  of  memory  is 
allocated. As Alloc refers to Heap in its declaration section, Alloc inherits all of the variables
defined  by  Heap.  Within  Alloc,  the  pre-state  is  referenced  using  the  simple  variable  names, 
whereas  the  post-state  is  referenced  using  primed  variables,  such  as  usage'.  Operations  are 
indicated  by  the  presence  of  a  (possibly  empty)  parameter  list.  The  parameter  list  for  Alloc  defines 
a  single  parameter,  addr,  which  is  the  newly  allocated  address. 

There are two formulas within Alloc. The first (used <: usage' = usage)¹ guarantees that the
allocation does not change any existing mappings. The second formula (used' = used U {addr})
indicates that the newly allocated address is now considered to be in use (in addition to any
addresses already in use).

The  third  schema,  uniqueAddrAlloc,  is  a  claim  that  asserts  that  the  newly  allocated  address  is  not 
in  use  prior  to  the  allocation. 

1.3  Reducing  the  Search  to  Validate  uniqueAddrAlloc 

A  common  analysis  of  NP  specifications  is  to  attempt  to  validate  claims  such  as 
uniqueAddrAlloc. A claim is valid if there are no assignments that satisfy the negation of the
claim. Nitpick [JD96b], the tool that I have implemented to analyze relational specifications,
validates claims (within user-specified finite bounds) using selective enumeration to solve the
negation  of  the  claim.  The  satisfying  assignments  for  the  negation  of  the  claim  are 
counterexamples  of  the  claim  itself. 
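
To make the link between validity and counterexamples concrete, the following Python sketch encodes Heap, Alloc, and the negation of uniqueAddrAlloc as predicates and searches for a satisfying assignment by naive enumeration. It is a sketch under stated assumptions, not Nitpick's algorithm: the scope of two addresses and two values, the dictionary encoding of partial functions, and all function names are hypothetical choices for the example.

```python
from itertools import combinations, product

ADDR = ["a0", "a1"]     # assumed scope: two addresses
VALUE = ["v0", "v1"]    # assumed scope: two values

def partial_functions(dom, ran):
    """Yield every partial function from dom to ran as a dict."""
    for images in product([None] + list(ran), repeat=len(dom)):
        yield {d: r for d, r in zip(dom, images) if r is not None}

def subsets(elems):
    """Yield every subset of elems as a frozenset."""
    for k in range(len(elems) + 1):
        for combo in combinations(elems, k):
            yield frozenset(combo)

def heap(usage, used):
    # used = dom usage
    return used == frozenset(usage)

def alloc(usage, used, usage_post, used_post, addr):
    # used <: usage' = usage
    restricted = {k: v for k, v in usage_post.items() if k in used}
    # used' = used U {addr}
    return restricted == usage and used_post == used | {addr}

def is_counterexample(usage, used, usage_post, used_post, a):
    """True when the negation of uniqueAddrAlloc holds."""
    return (heap(usage, used) and heap(usage_post, used_post)
            and alloc(usage, used, usage_post, used_post, a)
            and a in used)

def counterexamples():
    """Enumerate every assignment in scope and keep the counterexamples."""
    for usage_post in partial_functions(ADDR, VALUE):
        for used_post in subsets(ADDR):
            for usage in partial_functions(ADDR, VALUE):
                for used in subsets(ADDR):
                    for a in ADDR:
                        if is_counterexample(usage, used, usage_post, used_post, a):
                            yield {"usage'": usage_post, "used'": used_post,
                                   "usage": usage, "used": used, "a": a}

first = next(counterexamples(), None)
```

If first is not None, the claim is invalid within the chosen scope, and the returned assignment shows an address that was already in use before being allocated.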

Selective  enumeration  recognizes  and  exploits  two  basic  kinds  of  duplications:  partial  assignment 
duplicates  and  permutation  duplicates.  Two  assignments  are  partial  assignment  duplicates  if  they 
share  a  common  mapping  of  values  for  a  subset  of  the  variables  (called  a  partial  assignment)  and 
that  partial  assignment  itself  determines  the  value  of  the  formula. 

The  simplest  way  of  exploiting  partial  assignment  duplicates  is  by  exploiting  derived  variables 
[JD95].  In  many  specifications,  the  values  of  some  variables  are  defined  constructively,  that  is, 
their  value  is  constrained  to  be  equal  to  a  function  of  the  values  of  the  other  variables.  Variables 
with  a  constructive  definition  are  called  derived  variables.  Given  the  bindings  for  the  other 
variables,  the  search  can  directly  construct  the  value  of  a  derived  variable,  rather  than  generating 
many  possible  values  and  testing  each  one.  For  example,  the  formula  used  =  dom  usage  appears 
in  the  schema  Heap.  Therefore,  the  value  of  used  must  be  exactly  the  domain  of  the  value  of 
usage  in  any  counterexample  to  uniqueAddrAlloc.  Assuming  that  usage  is  bound  prior  to  used 
being  generated,  the  value  of  used  can  be  directly  computed. 
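
The saving from a derived variable can be illustrated with a small Python sketch. The scope of three addresses and three values matches the later examples in this section; the function names and the dictionary encoding of usage are assumptions for illustration, not part of Nitpick.

```python
from itertools import combinations, product

ADDR = ["a0", "a1", "a2"]
VALUE = ["v0", "v1", "v2"]

def partial_functions(dom, ran):
    """Yield every partial function from dom to ran as a dict."""
    for images in product([None] + list(ran), repeat=len(dom)):
        yield {d: r for d, r in zip(dom, images) if r is not None}

def subsets(elems):
    """Yield every subset of elems as a frozenset."""
    for k in range(len(elems) + 1):
        for combo in combinations(elems, k):
            yield frozenset(combo)

def naive_pairs():
    """Generate every (usage, used) pair, then test used = dom usage."""
    tried = kept = 0
    for usage in partial_functions(ADDR, VALUE):
        for used in subsets(ADDR):
            tried += 1
            if used == frozenset(usage):
                kept += 1
    return tried, kept

def derived_pairs():
    """Generate usage only; construct used directly as dom usage."""
    tried = kept = 0
    for usage in partial_functions(ADDR, VALUE):
        used = frozenset(usage)      # derived, never enumerated
        tried += 1
        kept += 1
    return tried, kept

print(naive_pairs())    # (512, 64)
print(derived_pairs())  # (64, 64)
```

Both functions keep the same 64 pairs, but the naive version tests 512 candidates (64 functions times 8 sets) to find them.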

As this example shows, selective enumeration requires the imposition of a variable
ordering. Although any ordering is legal for selective enumeration, some orderings yield a much
greater reduction in the number of assignments generated than others.²

Because the claim concludes with a not in used, any counterexample must satisfy the negated
constraint a in used; that is, the value of a must be an element of the value of used for
any counterexample to uniqueAddrAlloc. Although this constraint does not limit the possible
values  of  a  to  a  single  value,  the  constraint  can  be  used  to  limit  the  values  actually  generated 
during  the  search.  Bounded  generation  uses  constraints  from  the  formula  to  limit  the  values 
generated.  Assuming  that  used  is  bound  before  the  value  of  a  is  generated,  bounded  generation 
will  generate  each  element  in  the  set  that  is  the  value  of  used,  instead  of  each  value  in  the  given 
type  Addr. 
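
A minimal Python sketch of this first use of bounded generation follows. The function names and the three-element scope are assumptions for illustration; the point is only that the candidate list for a is drawn from the already-bound value of used rather than from all of Addr.

```python
ADDR = ["a0", "a1", "a2"]   # assumed scope for the given type Addr

def candidates_unbounded(used):
    """Without bounded generation: every element of Addr is tried for a."""
    return list(ADDR)

def candidates_bounded(used):
    """With bounded generation: only elements of used can satisfy a in used."""
    return [a for a in ADDR if a in used]

print(candidates_unbounded(frozenset({"a1"})))  # ['a0', 'a1', 'a2']
print(candidates_bounded(frozenset({"a1"})))    # ['a1']
print(candidates_bounded(frozenset()))          # []
```

When used is empty, no value of a is generated at all, so that branch of the search ends immediately.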

A second opportunity for bounded generation exists in uniqueAddrAlloc. The first formula in
Alloc, used <: usage' = usage, must be true in any counterexample to uniqueAddrAlloc. To
simplify the implementation, bounded generation does not directly take advantage of this
constraint; instead, this constraint implies a weaker constraint, usage <= usage'. Bounded
generation uses this weaker constraint to limit both the domain and range of any value generated
for usage to be subsets of the domain and range of the value of usage'.

1. The <: operator is the domain restriction operator. The result of this expression is a relation that includes
all of the pairs in the relation given as the second argument whose first element is contained in the set given
as the first argument. For the formal definition of this and other operators, refer to Definition 16 on page 14.

2. The mechanism for choosing an ordering is beyond the scope of this paper. Nitpick uses a heuristic to
choose the ordering, as computing an optimal ordering is factorial in the number of variables.

Derived  variable  analysis  and  bounded  generation  cannot  fully  exploit  all  formulae  within  a 
specification.  If  these  formulae  do  not  depend  on  all  the  variables,  they  still  present  an  opportunity 
for  reducing  the  assignments  to  be  generated.  Short  circuiting  [DJ96]  does  not  reduce  the  number 
of  values  generated  for  any  variables  involved  in  the  formula,  as  would  bounded  generation. 
Instead,  short  circuiting  prevents  generation  of  values  for  any  subsequent  variables  when  the 
partial  assignment  cannot  satisfy  the  formula. 

An example of short circuiting can be found for the constraint on usage and usage' that initiated
the second bounded generation example. Although bounded generation will guarantee that
dom usage <= dom usage' and ran usage <= ran usage', this constraint does not guarantee that
usage <= usage'. Once usage and usage' have both been generated, short circuiting evaluates
the constraint usage <= usage' for the resulting partial assignment. Short circuiting will
terminate the current path of the search for any partial assignments not satisfying the constraint.
Similarly, once usage, usage', and used have been generated, short circuiting will check the full
constraint, used <: usage' = usage. By utilizing all three techniques, selective enumeration can eliminate
all partial assignment duplicates available with the selected ordering.
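
Short circuiting can be sketched as a depth-first search that checks each constraint as soon as all of its variables are bound. The Python below is a hedged illustration, not Nitpick's code: the search function, the stats counter, the encoding of relations as frozensets of pairs, and the three candidate values are all assumptions chosen to keep the example small.

```python
def search(order, domains, checks, partial=None, stats=None):
    """Depth-first generate-and-test with short circuiting.

    order:   variable names in generation order
    domains: dict mapping each variable to its candidate values
    checks:  list of (needed_variables, predicate) pairs
    stats:   optional dict whose "nodes" entry counts generated values
    """
    partial = {} if partial is None else partial
    stats = {"nodes": 0} if stats is None else stats
    if len(partial) == len(order):
        yield dict(partial)
        return
    var = order[len(partial)]
    for value in domains[var]:
        partial[var] = value
        stats["nodes"] += 1
        bound = set(partial)
        # Evaluate only the constraints that became fully bound at this level.
        ready = [p for needed, p in checks if var in needed and needed <= bound]
        if all(p(partial) for p in ready):
            yield from search(order, domains, checks, partial, stats)
        # Otherwise the subtree below this partial assignment is never generated.
        del partial[var]

# Hypothetical toy instance: three candidate relations and two addresses.
relations = [frozenset(),
             frozenset({("a0", "v0")}),
             frozenset({("a0", "v0"), ("a1", "v1")})]
order = ["usage'", "usage", "a"]
domains = {"usage'": relations, "usage": relations, "a": ["a0", "a1"]}
checks = [({"usage'", "usage"}, lambda p: p["usage"] <= p["usage'"])]

stats = {"nodes": 0}
solutions = list(search(order, domains, checks, stats=stats))
print(len(solutions), stats["nodes"])   # 12 24
```

In this toy instance the check usage <= usage' rejects three of the nine (usage', usage) pairs, so the search generates 24 values instead of the 30 it would generate without the check.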

The second form of duplication is called permutation duplication. Because each element in a
given type is unstructured, exchanging a pair of elements throughout an assignment does not
change the interpretation of the formula for that assignment. Isomorph elimination [JJD96; JJD98]
prevents the generation of most values that are permutations of other values already generated.³

As an example of isomorph elimination, consider the values generated for usage'. If Addr and
Value are limited to three elements apiece, it is necessary to generate 64 ($(\#\mathit{range} + 1)^{\#\mathit{domain}} = 4^3$) values
for the partial function usage' without isomorph elimination. With isomorph elimination, on the
other hand, only the following seven values need be generated:

usage' = ∅
usage' = { (a0, v0) }
usage' = { (a0, v0), (a1, v1) }
usage' = { (a0, v0), (a1, v0) }
usage' = { (a0, v0), (a1, v1), (a2, v2) }
usage' = { (a0, v0), (a1, v0), (a2, v1) }
usage' = { (a0, v0), (a1, v0), (a2, v0) }
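
The counts of 64 and 7 can be checked by brute force. The Python sketch below enumerates every partial function from three addresses to three values and groups them by a canonical form taken over all permutations of both given types. This exhaustive canonicalization is an assumption made for the illustration; it is not the technique Nitpick uses, which considers only a selected subset of permutations.

```python
from itertools import permutations, product

ADDR = ["a0", "a1", "a2"]
VALUE = ["v0", "v1", "v2"]

def partial_functions(dom, ran):
    """Yield every partial function from dom to ran as a frozenset of pairs."""
    for images in product([None] + list(ran), repeat=len(dom)):
        yield frozenset((d, r) for d, r in zip(dom, images) if r is not None)

def canonical(fn):
    """Smallest relabelling of fn under all permutations of ADDR and VALUE."""
    best = None
    for perm_a in permutations(ADDR):
        rename_a = dict(zip(ADDR, perm_a))
        for perm_v in permutations(VALUE):
            rename_v = dict(zip(VALUE, perm_v))
            image = tuple(sorted((rename_a[a], rename_v[v]) for a, v in fn))
            if best is None or image < best:
                best = image
    return best

all_values = list(partial_functions(ADDR, VALUE))
# One representative per isomorphism class, ordered by size for readability.
representatives = sorted({canonical(f) for f in all_values},
                         key=lambda t: (len(t), t))

print(len(all_values), len(representatives))   # 64 7
```

Each of the seven representatives corresponds to one way of sending zero, one, two, or three addresses to values that are either shared or distinct.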

The  result  of  this  reduced  search  is  illustrated  in  Figure  2  and  Figure  3.  Figure  2  demonstrates  the 
search  until  the  first  counterexample  is  found.  When  the  number  of  elements  in  Addr  and  Value 
are  limited  to  three  apiece,  derived  variables  and  bounded  generation  reduce  the  search  to  find  the 
first  counterexample  from  13,851  assignments  to  just  3.  Figure  3  expands  the  tree  for  one  more 
value  of  usage',  exhibiting  the  further  advantages  of  short  circuiting  and  isomorph  elimination. 


3. The current implementation of isomorph elimination does not consider all possible permutations. In
particular, only products of selected single permutations are considered.

Figure 2: The search tree for finding a counterexample to the claim uniqueAddrAlloc. The variables used'
and used are derived; their values can be directly computed from the earlier assignments. Bounded
generation limits the domain and range of the values generated for usage to a subset of the domain and
range used in usage'. Similarly, bounded generation limits the values considered for a to the elements in the
value of used. The first two paths down the search tree result in used being empty, leaving no possible
values for a. The first counterexample discovered is shown in a heavier box.


Figure 3: Continuation of the search tree from Figure 2. The satisfying assignments are shown in the heavier boxes. Isomorph elimination generated {a0->v0, a1->v1} as the next
value for usage' because all other single-edge values are isomorphic to the one already generated in Figure 2 ({a0->v0}). Short circuiting truncates the search for the rightmost two
values generated for usage, as these partial assignments do not satisfy the requirements of Alloc. In particular, these partial assignments do not satisfy the formula
usage <= usage', which is derived from used <: usage' = usage.
1.4 Related Work

The model-generation community [Zha96; ZZ95; Sla94] has addressed a problem that is similar to
the relational formula satisfaction problem addressed by selective enumeration. A model-generation
tool searches for a satisfying assignment, or model, for a formula. The logic supported by the
model-generation tools differs from the logic supported by NP. Variables are allowed to be
arbitrary-arity functions in most of these tools, whereas NP directly supports only unary functions.
(Arbitrary-arity functions can be expressed in NP through careful encoding, but the resultant formula
is hard to understand and selective enumeration is particularly inefficient at analyzing such
formulae.) NP, on the other hand, adds support for transitive closure, which is difficult or impossible to
express generally in the model-generation languages. There is also a difference in apparent goals:
selective enumeration is best suited for solving formulae with several variables using a relatively
small scope, whereas the model generators appear targeted towards formulae with few variables
(frequently one) using a larger scope.

Despite these differences, some of the model generators use an approach similar to the one used in
selective enumeration. Zhang [Zha96], in particular, used a similar set of reduction techniques in
FALCON. FALCON includes a simpler form of isomorph elimination, a direct equivalent to
derived-variable construction, and a backtracking feature that is similar to short circuiting. Slaney
[Sla94] also uses a backtracking approach in Finder; he achieves reductions similar to those gained
with bounded generation by separating the enumeration of functions into separate boolean
variables, each representing a single maplet.

Jipsen’s approach [Jip92] finds a Boolean algebra with operators (BAO) that satisfies a set of
first-order equations. His approach does not require finite bounds, and thus can be used as a true verifier.
As the relational calculus can be embedded in BAO, this approach will also solve relational
formulae. However, there are a number of difficulties with Jipsen’s work. His approach has never
been proved complete: it may never terminate for an unsatisfiable system of equations. As
transitive closure is not supported within BAO, most of our specifications could not be expressed in
full generality. No experimental results are provided, so it is not possible to compare his approach
to selective enumeration, even for the finite domain.

The most general related problem is the well-known boolean-satisfiability problem. Although the
problem itself is NP-complete, researchers have taken two major approaches to achieve an
effective solution in realistic time for many formulae. One approach, found in [SLM92] among others,
provides an unsound solver, which may fail to return a solution even if one exists. In the other
widespread approach, a structure or algorithm provides significantly reduced exponential growth
for common formulae, although the worst-case performance may lag even the most naive
approach. Binary decision diagrams (or BDDs) [Bry92] are a popular structure that provides this
generally reduced exponential growth.

There is a straightforward translation from relational formulae to boolean formulae, so any of the
boolean satisfiability approaches can be applied to solve any relational formulae. This conversion
loses much of the higher-level semantics of the relational formulae. Selective enumeration uses
these semantics to produce its reductions. As given in [DJJ96], translating a relational formula into
the corresponding boolean one and solving the boolean formula using BDDs required
approximately the same time as solving the relational formula directly using selective enumeration. The
introduction of bounded generation and an improved isomorph-elimination technique has moved
the balance significantly towards selective enumeration. Although it is possible that further effort
on the BDD version could similarly reduce the time required, I believe that the additional semantics
available in the relational formula will give an advantage to selective enumeration.

Solving a relational formula could be structured as a constraint satisfaction problem. Traditional
constraint satisfaction approaches [Kum92; Mac92], used to solve problems such as shape
recognition [Wal75] or job shop scheduling [SF91], support only a much more limited constraint
language. Finite constraint satisfaction, the area most similar to selective enumeration, allows only a
restricted subset of Horn clauses to express the constraints. For most of the existing constraint
satisfaction algorithms, all constraints must be binary (involve no more than two variables).
Constraint satisfaction algorithms also typically require a complete enumeration of possible values for
each variable, which can be prohibitively expensive for the relation-typed variables commonly
found in NP specifications.

There are significant similarities between the approaches taken in constraint satisfaction and the
approach taken in selective enumeration. Backtracking in constraint satisfaction is a direct
equivalent of short circuiting in selective enumeration, requiring the same care in selecting a variable
ordering. The standard constraint propagation algorithms, including arc consistency and
k-consistency, are strongly reminiscent of bounded generation. They are limited, however, to the weaker
constraint language.
2. Basic Definitions

This  section  develops  the  basic  terminology  for  defining  selective  enumeration  precisely.  In  the 
following  sections,  I  use  this  terminology  to  define  each  technique  that  implements  selective 
enumeration.  I  also  prove  the  soundness  of  each  of  these  techniques. 

This  section  begins  by  defining  the  basic  concepts  underlying  any  generate-and-test  search.  From 
these,  I  develop  precise  definitions  of  the  generators  and  duplications  that  are  the  essence  of 
selective  enumeration.  A  formal  definition  of  soundness  follows  naturally  from  these  definitions. 

2.1  Values  and  Variables 

The search chooses values constructed from the finite universe $\mathcal{U}$ of atomic elements. Each
element within $\mathcal{U}$ is itself unstructured. In this initial analysis, I ignore the type distinctions
between elements. Therefore, for the simple Alloc example described in the prior section, $\mathcal{U}$ is the
union of the Addr and Value sets. In Section 6, I will re-introduce given types to differentiate the
elements of $\mathcal{U}$.

There are three kinds of values: (1) atomic elements of $\mathcal{U}$, (2) sets of atomic elements, or (3)
binary relations on the atomic elements.

Definition 1:

$$
\begin{aligned}
\mathit{Value}_{\mathit{scalar}} &= \mathcal{U} \\
\mathit{Value}_{\mathit{set}} &= \mathbb{P}\,\mathcal{U} \\
\mathit{Value}_{\mathit{rel}} &= \mathbb{P}(\mathcal{U} \times \mathcal{U}) \\
\mathit{Value} &= \mathit{Value}_{\mathit{scalar}} \cup \mathit{Value}_{\mathit{set}} \cup \mathit{Value}_{\mathit{rel}}
\end{aligned}
$$
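
As a quick check on the three kinds of values, the Python sketch below enumerates them for a small universe. The two-element universe and the names used are assumptions for illustration. With n atomic elements there are n scalar values, 2^n set values, and 2^(n^2) relation values, which is why relation-typed variables dominate the cost of the search.

```python
from itertools import combinations, product

def powerset(elems):
    """Return every subset of elems as a frozenset."""
    elems = list(elems)
    return [frozenset(c)
            for k in range(len(elems) + 1)
            for c in combinations(elems, k)]

universe = ["e0", "e1"]                             # assumed two-element universe
value_scalar = list(universe)                       # atomic elements
value_set = powerset(universe)                      # sets of atomic elements
value_rel = powerset(product(universe, repeat=2))   # sets of ordered pairs

print(len(value_scalar), len(value_set), len(value_rel))   # 2 4 16
```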

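Each claim or schema in a specification defines a set of variables. I divide the complete collection
of variables into three sets based on the kind of value they denote: $\mathit{Var}_{\mathit{scalar}}$, $\mathit{Var}_{\mathit{set}}$, and $\mathit{Var}_{\mathit{rel}}$.
The intuitive relationship between these variables and the corresponding values will be
maintained; a variable in $\mathit{Var}_{\mathit{scalar}}$ will only be bound to an atomic element whereas a variable in
$\mathit{Var}_{\mathit{set}}$ will only be bound to a set of atomic elements.

Definition 2: $\mathit{Variable} = \mathit{Var}_{\mathit{scalar}} \cup \mathit{Var}_{\mathit{set}} \cup \mathit{Var}_{\mathit{rel}}$

Definition 3: $N = |\mathit{Variable}|$

For the claim uniqueAddrAlloc from Figure 1, $N$ is 5 and the variables are

$\mathit{Var}_{\mathit{scalar}} = \{\, a \,\}$

$\mathit{Var}_{\mathit{set}} = \{\, \mathit{used},\ \mathit{used}' \,\}$

$\mathit{Var}_{\mathit{rel}} = \{\, \mathit{usage},\ \mathit{usage}' \,\}$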

To  perform  the  search,  the  variables  need  to  be  ordered.  This  order  corresponds  to  the  ordering 
used  in  the  search  tree. 

Definition 4: $\mathit{Ord} : \mathit{Variable} \to 1 \ldots N$

For the uniqueAddrAlloc example, I will use the ordering from the search illustrated in Figure 2
and Figure 3:

$\mathit{Ord} = \{\, \mathit{usage}' \mapsto 1,\ \mathit{used}' \mapsto 2,\ \mathit{usage} \mapsto 3,\ \mathit{used} \mapsto 4,\ a \mapsto 5 \,\}$

A  useful  construct  is  the  ordering-based  subsets  of  variables. 

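Definition 5: $\mathit{Var}_i = \{\, v \mid 1 \le \mathit{Ord}(v) \le i \,\}$

By convention, $\mathit{Var}_0$ is the empty set of variables. Using the Ord for uniqueAddrAlloc from the
prior paragraph, the value for $\mathit{Var}_3$ is

$\mathit{Var}_3 = \{\, \mathit{usage}',\ \mathit{used}',\ \mathit{usage} \,\}$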
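
The ordering and its prefixes translate directly into code. In this Python sketch the dictionary ORD and the function var_prefix are hypothetical names; the ordering itself is the one given above for uniqueAddrAlloc.

```python
# The ordering used for uniqueAddrAlloc: variable name -> position in the search tree.
ORD = {"usage'": 1, "used'": 2, "usage": 3, "used": 4, "a": 5}

def var_prefix(i, ord_map=ORD):
    """Var_i: the variables whose position in the ordering is between 1 and i."""
    return {v for v, position in ord_map.items() if 1 <= position <= i}

print(sorted(var_prefix(3)))   # ['usage', "usage'", "used'"]
print(var_prefix(0))           # set()
```

The prefix for i = 0 is empty and the prefix for i = N contains every variable, matching the convention stated for Definition 5.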

2.2  Language 

The  next  basic  element  is  the  language  used  to  express  the  formula  itself.  The  NP  language  is 
intended  for  human  consumption;  structures  such  as  schemas  offer  no  additional  expressive 
power.  Nitpick  translates  each  specification  from  the  NP  language  into  a  simpler  formula 
language.  To  further  simplify  this  analysis,  I  define  only  a  subset  of  the  formula  language  here. 
Extending  this  analysis  to  include  the  entire  formula  language  is  a  straightforward  exercise. 

The  alphabet  of  the  simplified  formula  language  includes  the  variables  and  the  operators. 

$\mathfrak{A} = \mathit{Variable} \cup \{$ {, }, dom, ran, func, U, &, <:, in, =, <=, Un, (, ), and, or, not $\}$

The foundation of the language is the terms. Terms describe all of the values that can be
constructed using this language. As with other items, terms are divided into three categories:
$\mathit{Term}_{\mathit{scalar}}$, $\mathit{Term}_{\mathit{set}}$, and $\mathit{Term}_{\mathit{rel}}$.

Definition 6: $\mathit{Term} = \mathit{Term}_{\mathit{scalar}} \cup \mathit{Term}_{\mathit{set}} \cup \mathit{Term}_{\mathit{rel}}$, where
$\mathit{Term}_{\mathit{scalar}}$, $\mathit{Term}_{\mathit{set}}$, and $\mathit{Term}_{\mathit{rel}}$ are defined by the BNF grammars

Term_scalar ::= Var_scalar

Term_set    ::= Var_set | { Term_scalar } | { } | Un |
                ( Term_set U Term_set ) | ( Term_set & Term_set ) |
                ( Term_set \ Term_set ) | Term_rel ( Term_scalar ) |
                dom Term_rel | ran Term_rel

Term_rel    ::= Var_rel | ( Term_set <: Term_rel )

As an overview, Un is the universe of possible values of the appropriate type, & is the intersection
operator, and Term_rel ( Term_scalar ) gives the relational image. The complete, formal definitions of
the operators are given in the definitions starting on page 13.

Atomic  formulae  are  constructed  from  terms. 

Definition 7: AtomicFormula, the set of atomic formulae, is defined by the BNF grammar

AtomicFormula ::= Term_scalar in Term_set | Term_scalar = Term_scalar |
                  Term_set = Term_set | Term_set <= Term_set |
                  Term_rel = Term_rel | func Term_rel

Finally, Wffs in the formula language are built from atomic formulae.

Definition 8: Wff, the set of well-formed formulae, is defined by the BNF grammar

Wff ::= AtomicFormula | not Wff | ( Wff and Wff ) | ( Wff or Wff )

For  any  formula  in  the  language,  there  is  a  unique  derivation  for  that  formula  given  by  this 
grammar.  The  claim  uniqueAddrAlloc  from  Figure  1,  which  is  written  in  NP,  is  translated  into  the 
formula  language  as 

Formula  1 :  (  not  ( (dom  usage  =  used  and  dom  usage'  =  used') 
and  ( func  usage  and  func  usage' ) )  or 

(not  ( (dom  usage  =  used  and  dom  usage'  =  used')  and 

( (used  <:  usage')  =  usage  and  used'  =  (used  U  {a}) ) ) 

or  not  a  in  used) ) 

For  the  formula  language,  the  set  of  free  variables  for  a  formula  is  exactly  the  set  of  variables  used 
in  the  formula. 

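Definition 9: The free variables of a term, $FV(\tau) : \mathit{Term} \to \mathbb{P}\,\mathit{Variable}$, is defined as

\[
FV(\tau) =
\begin{cases}
\{v\} & \text{if } \tau \text{ is } v \text{ where } v \in \mathit{Variable} \\
FV(\tau_1) & \text{if } \tau \text{ is } \{\tau_1\} \text{ where } \tau_1 \in \mathit{Term}_{scalar} \\
\emptyset & \text{if } \tau \text{ is } \{\,\} \\
\emptyset & \text{if } \tau \text{ is } \texttt{Un} \\
FV(\tau_1) \cup FV(\tau_2) & \text{if } \tau \text{ is } (\tau_1\ op\ \tau_2) \text{ where } \tau_1,\tau_2 \in \mathit{Term} \text{ and } op \text{ is one of } \texttt{U}, \texttt{\&}, \texttt{<:}, \text{or } \backslash \\
FV(\tau_1) & \text{if } \tau \text{ is } op\ \tau_1 \text{ where } \tau_1 \in \mathit{Term}_{rel} \text{ and } op \text{ is either } \texttt{dom} \text{ or } \texttt{ran} \\
FV(\tau_1) \cup FV(\tau_2) & \text{if } \tau \text{ is } \tau_1(\tau_2) \text{ where } \tau_1 \in \mathit{Term}_{rel},\ \tau_2 \in \mathit{Term}_{scalar}
\end{cases}
\]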

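Definition 10: The free variables of a formula, $FV(\phi) : \mathit{Wff} \to \mathbb{P}\,\mathit{Variable}$, is defined as

\[
FV(\phi) =
\begin{cases}
FV(\tau_1) \cup FV(\tau_2) & \text{if } \phi \text{ is } \tau_1\ op\ \tau_2 \text{ where } \tau_1,\tau_2 \in \mathit{Term} \text{ and } op \text{ is either } \texttt{=}, \texttt{in}, \text{or } \texttt{<=} \\
FV(\tau_1) & \text{if } \phi \text{ is } \texttt{func}\ \tau_1 \text{ where } \tau_1 \in \mathit{Term}_{rel} \\
FV(\phi_1) \cup FV(\phi_2) & \text{if } \phi \text{ is } (\phi_1\ op\ \phi_2) \text{ where } \phi_1,\phi_2 \in \mathit{Wff} \text{ and } op \text{ is either } \texttt{and} \text{ or } \texttt{or} \\
FV(\phi_1) & \text{if } \phi \text{ is } \texttt{not}\ \phi_1 \text{ where } \phi_1 \in \mathit{Wff}
\end{cases}
\]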
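The free-variable definitions can be illustrated with a minimal Python sketch. The tuple encoding of terms and formulae, and the names `fv_term` and `fv_formula`, are assumptions made for this illustration only; they are not part of the formula language or of any tool described in this paper.

```python
# Hypothetical encoding (an assumption of this sketch): variables are strings,
# every other term or formula is a tuple whose first element names the production.
#   ("set", t)  -> { t }      ("empty",) -> { }      ("un",) -> Un
#   ("U" | "&" | "\\" | "<:", t1, t2), ("dom" | "ran", t), ("image", t_rel, t_scalar)
#   formulae: ("=" | "in" | "<=", t1, t2), ("func", t), ("and" | "or", f1, f2), ("not", f)

def fv_term(t):
    """Free variables of a term, following the cases of Definition 9."""
    if isinstance(t, str):          # a variable contributes itself
        return {t}
    if t[0] in ("empty", "un"):     # { } and Un have no free variables
        return set()
    # every other production is the union over its sub-terms
    return set().union(*(fv_term(sub) for sub in t[1:]))

def fv_formula(f):
    """Free variables of a formula, following the cases of Definition 10."""
    if f[0] in ("=", "in", "<=", "func"):    # atomic formulae contain terms
        return set().union(*(fv_term(sub) for sub in f[1:]))
    return set().union(*(fv_formula(sub) for sub in f[1:]))   # and / or / not

# A fragment of the running example: used' = (used U {a}) and a in used
fragment = ("and",
            ("=", "used'", ("U", "used", ("set", "a"))),
            ("in", "a", "used"))
print(sorted(fv_formula(fragment)))
```

Because the language has no binders, the result is simply every variable that occurs in the formula, which matches the remark preceding Definition 9.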
2.3 Assignments

A  generate-and-test  search  solves  a  formula  by  generating  assignments,  then  interpreting  the 
formula  for  each  assignment  generated.  An  assignment  is  a  mapping  from  variables  to  appropriate 
values.  An  assignment  can  be  a  full  assignment,  mapping  all  variables  to  appropriate  values,  or  it 
can  be  a  partial  assignment,  mapping  only  a  subset  of  the  variables  to  values. 

Definition 11:
\[
S : \mathit{Variable} \rightharpoonup \mathit{Value} = \{\, v \mapsto x \mid
(v \in \mathit{Var}_{scalar} \Rightarrow x \in \mathit{Value}_{scalar}) \land
(v \in \mathit{Var}_{set} \Rightarrow x \in \mathit{Value}_{set}) \land
(v \in \mathit{Var}_{rel} \Rightarrow x \in \mathit{Value}_{rel}) \,\}
\]

$S$, therefore, is the set of all well-typed assignments. A useful decomposition of $S$ is based on which
variables are actually mapped.

Definition 12: $S_i = \{\, s \in S \mid \mathrm{dom}\ s = \mathit{Var}_i \,\}$

$S_i$ is the set of all assignments that map exactly the first $i$ variables, as defined by Ord. The
assignments on the $i$th level of the search tree are drawn from $S_i$. Two such sets are of particular
interest: $S_0$ contains only the empty assignment and $S_N$ contains all full assignments.

2.4  Interpretation 

A  precise  semantics  of  the  formula  language  is  needed  to  analyze  the  assignments.  A  formula  is 
interpreted  as  true,  false  or  unknown  for  any  given  assignment.  Intuitively,  the  interpretation  of  a 
formula  may  be  unknown  if  any  of  the  free  variables  of  the  formula  are  not  mapped  by  the 
assignment.  Otherwise,  each  variable  in  the  formula  is  replaced  by  the  corresponding  value  from 
the  assignment  and  the  formula  is  evaluated  using  the  usual  semantics  for  relational  formulae. 

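The interpretation of a formula is dependent on the interpretation of each term. The interpretation
of a term in the formula language is given as $\bar{s}(\tau)$. $\bar{s}$ is the union of three functions: $\bar{s}_{scalar}$, $\bar{s}_{set}$,
and $\bar{s}_{rel}$.

Definition 13: $\bar{s} = \bar{s}_{scalar} \cup \bar{s}_{set} \cup \bar{s}_{rel}$

Definition 14: $\bar{s}_{scalar}(\tau) : \mathit{Term}_{scalar} \to \mathit{Value}_{scalar} \cup \{\mathit{unknown}\}$ is defined, for $\tau \in \mathit{Var}_{scalar}$, as

\[
\bar{s}_{scalar}(\tau) =
\begin{cases}
s(\tau) & \text{if } \tau \in \mathrm{dom}\ s \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]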

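Definition 15: $\bar{s}_{set}(\tau) : \mathit{Term}_{set} \to \mathit{Value}_{set} \cup \{\mathit{unknown}\}$ is defined by cases on the form of $\tau$. In each case the result is unknown whenever the stated side condition fails.

\[
\bar{s}_{set}(\tau) =
\begin{cases}
s(v) & \text{if } \tau \text{ is } v \text{ where } v \in \mathit{Var}_{set}, \text{ and } v \in \mathrm{dom}\ s \\
\{\, x \mid x = \bar{s}_{scalar}(\tau_1) \,\} & \text{if } \tau \text{ is } \{\tau_1\} \text{ where } \tau_1 \in \mathit{Term}_{scalar}, \text{ and } \bar{s}_{scalar}(\tau_1) \neq \mathit{unknown} \\
\emptyset & \text{if } \tau \text{ is } \{\,\} \\
\mathcal{U} & \text{if } \tau \text{ is } \texttt{Un} \\
\{\, x \mid x \in \bar{s}_{set}(\tau_1) \lor x \in \bar{s}_{set}(\tau_2) \,\} & \text{if } \tau \text{ is } \tau_1\ \texttt{U}\ \tau_2 \text{ where } \tau_1,\tau_2 \in \mathit{Term}_{set}, \text{ and } \bar{s}_{set}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_2) \neq \mathit{unknown} \\
\{\, x \mid x \in \bar{s}_{set}(\tau_1) \land x \in \bar{s}_{set}(\tau_2) \,\} & \text{if } \tau \text{ is } \tau_1\ \texttt{\&}\ \tau_2 \text{ where } \tau_1,\tau_2 \in \mathit{Term}_{set}, \text{ and } \bar{s}_{set}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_2) \neq \mathit{unknown} \\
\{\, x \mid x \in \bar{s}_{set}(\tau_1) \land x \notin \bar{s}_{set}(\tau_2) \,\} & \text{if } \tau \text{ is } \tau_1 \setminus \tau_2 \text{ where } \tau_1,\tau_2 \in \mathit{Term}_{set}, \text{ and } \bar{s}_{set}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_2) \neq \mathit{unknown} \\
\{\, x \mid \exists y.\ (x,y) \in \bar{s}_{rel}(\tau_1) \,\} & \text{if } \tau \text{ is } \texttt{dom}\ \tau_1 \text{ where } \tau_1 \in \mathit{Term}_{rel}, \text{ and } \bar{s}_{rel}(\tau_1) \neq \mathit{unknown} \\
\{\, y \mid \exists x.\ (x,y) \in \bar{s}_{rel}(\tau_1) \,\} & \text{if } \tau \text{ is } \texttt{ran}\ \tau_1 \text{ where } \tau_1 \in \mathit{Term}_{rel}, \text{ and } \bar{s}_{rel}(\tau_1) \neq \mathit{unknown} \\
\{\, y \mid (\bar{s}_{scalar}(\tau_2), y) \in \bar{s}_{rel}(\tau_1) \,\} & \text{if } \tau \text{ is } \tau_1(\tau_2) \text{ where } \tau_1 \in \mathit{Term}_{rel} \land \tau_2 \in \mathit{Term}_{scalar}, \text{ and } \bar{s}_{rel}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{scalar}(\tau_2) \neq \mathit{unknown} \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]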

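Definition 16: $\bar{s}_{rel}(\tau) : \mathit{Term}_{rel} \to \mathit{Value}_{rel} \cup \{\mathit{unknown}\}$ is defined as

\[
\bar{s}_{rel}(\tau) =
\begin{cases}
s(\tau) & \text{if } \tau \in \mathit{Var}_{rel} \text{ and } \tau \in \mathrm{dom}\ s \\
\{\, (x,y) \mid x \in \bar{s}_{set}(\tau_1) \land (x,y) \in \bar{s}_{rel}(\tau_2) \,\} & \text{if } \tau \text{ is } \tau_1\ \texttt{<:}\ \tau_2 \text{ where } \tau_1 \in \mathit{Term}_{set} \land \tau_2 \in \mathit{Term}_{rel}, \text{ and } \bar{s}_{set}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{rel}(\tau_2) \neq \mathit{unknown} \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]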

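The three $\bar{s}$ functions derive directly from $s$. Therefore, each assignment $s$ induces a
corresponding $\bar{s}$ function.

The interpretation of a formula for an assignment is given using $\models\phi[s]$, where $s \in S$ and $\phi \in \mathit{Wff}$. It
follows from the $\bar{s}$ functions.

Definition 17: $\models\phi[s] : S \times \mathit{Wff} \to \{\mathit{true}, \mathit{false}, \mathit{unknown}\}$ is defined by cases on the form of $\phi$.

If $\phi$ is $\tau_1$ = $\tau_2$, where $\tau_1,\tau_2 \in \mathit{Term}_{scalar}$:
\[
\models\phi[s] =
\begin{cases}
\mathit{true} & \text{if } \bar{s}_{scalar}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{scalar}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{scalar}(\tau_1) = \bar{s}_{scalar}(\tau_2) \\
\mathit{false} & \text{if } \bar{s}_{scalar}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{scalar}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{scalar}(\tau_1) \neq \bar{s}_{scalar}(\tau_2) \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]

If $\phi$ is $\tau_1$ in $\tau_2$, where $\tau_1 \in \mathit{Term}_{scalar}$ and $\tau_2 \in \mathit{Term}_{set}$:
\[
\models\phi[s] =
\begin{cases}
\mathit{true} & \text{if } \bar{s}_{scalar}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{scalar}(\tau_1) \in \bar{s}_{set}(\tau_2) \\
\mathit{false} & \text{if } \bar{s}_{scalar}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{scalar}(\tau_1) \notin \bar{s}_{set}(\tau_2) \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]

If $\phi$ is $\tau_1$ = $\tau_2$, where $\tau_1,\tau_2 \in \mathit{Term}_{set}$:
\[
\models\phi[s] =
\begin{cases}
\mathit{true} & \text{if } \bar{s}_{set}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_1) = \bar{s}_{set}(\tau_2) \\
\mathit{false} & \text{if } \bar{s}_{set}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_1) \neq \bar{s}_{set}(\tau_2) \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]

If $\phi$ is $\tau_1$ <= $\tau_2$, where $\tau_1,\tau_2 \in \mathit{Term}_{set}$:
\[
\models\phi[s] =
\begin{cases}
\mathit{true} & \text{if } \bar{s}_{set}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_1) \subseteq \bar{s}_{set}(\tau_2) \\
\mathit{false} & \text{if } \bar{s}_{set}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{set}(\tau_1) \not\subseteq \bar{s}_{set}(\tau_2) \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]

If $\phi$ is $\tau_1$ = $\tau_2$, where $\tau_1,\tau_2 \in \mathit{Term}_{rel}$:
\[
\models\phi[s] =
\begin{cases}
\mathit{true} & \text{if } \bar{s}_{rel}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{rel}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{rel}(\tau_1) = \bar{s}_{rel}(\tau_2) \\
\mathit{false} & \text{if } \bar{s}_{rel}(\tau_1) \neq \mathit{unknown} \land \bar{s}_{rel}(\tau_2) \neq \mathit{unknown} \land \bar{s}_{rel}(\tau_1) \neq \bar{s}_{rel}(\tau_2) \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]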
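If $\phi$ is func $\tau_1$, where $\tau_1 \in \mathit{Term}_{rel}$:
\[
\models\phi[s] =
\begin{cases}
\mathit{true} & \text{if } \bar{s}_{rel}(\tau_1) \neq \mathit{unknown} \land \forall x,y,z \in \mathcal{U}.\ ((x,y) \in \bar{s}_{rel}(\tau_1) \land (x,z) \in \bar{s}_{rel}(\tau_1)) \Rightarrow y = z \\
\mathit{false} & \text{if } \bar{s}_{rel}(\tau_1) \neq \mathit{unknown} \land \exists x,y,z \in \mathcal{U}.\ y \neq z \land ((x,y) \in \bar{s}_{rel}(\tau_1) \land (x,z) \in \bar{s}_{rel}(\tau_1)) \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]

If $\phi$ is ( $\phi_1$ and $\phi_2$ ), where $\phi_1,\phi_2 \in \mathit{Wff}$:
\[
\models\phi[s] =
\begin{cases}
\mathit{true} & \text{if } \models\phi_1[s] = \mathit{true} \land \models\phi_2[s] = \mathit{true} \\
\mathit{false} & \text{if } \models\phi_1[s] = \mathit{false} \lor \models\phi_2[s] = \mathit{false} \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]

If $\phi$ is ( $\phi_1$ or $\phi_2$ ), where $\phi_1,\phi_2 \in \mathit{Wff}$:
\[
\models\phi[s] =
\begin{cases}
\mathit{true} & \text{if } \models\phi_1[s] = \mathit{true} \lor \models\phi_2[s] = \mathit{true} \\
\mathit{false} & \text{if } \models\phi_1[s] = \mathit{false} \land \models\phi_2[s] = \mathit{false} \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]

If $\phi$ is not $\phi_1$, where $\phi_1 \in \mathit{Wff}$:
\[
\models\phi[s] =
\begin{cases}
\mathit{true} & \text{if } \models\phi_1[s] = \mathit{false} \\
\mathit{false} & \text{if } \models\phi_1[s] = \mathit{true} \\
\mathit{unknown} & \text{otherwise}
\end{cases}
\]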

As an example, consider the negation of Formula 1 (derived from uniqueAddrAlloc) given by
Formula 2:

Formula 2: $\phi$ = ( ( (dom usage = used and dom usage' = used') and
                  ( func usage and func usage' ) ) and
                ( ( (used <: usage') = usage and used' = (used U {a}) )
                  and a in used) )

This formula can be interpreted using any valid assignment. Assuming $a_0$ and $v_0$ are elements of
$\mathcal{U}$, three possible interpretations are

\[ \models\phi[\{\,\}] = \mathit{unknown} \]

\[ \models\phi[\{\, \mathit{usage'} \mapsto \{(a_0, v_0)\},\ \mathit{used'} \mapsto \emptyset \,\}] = \mathit{false} \]

\[ \models\phi[\{\, \mathit{usage'} \mapsto \{(a_0, v_0)\},\ \mathit{used'} \mapsto \{a_0\},\ \mathit{usage} \mapsto \{(a_0, v_0)\},\ \mathit{used} \mapsto \{a_0\},\ a \mapsto a_0 \,\}] = \mathit{true} \]

If  the  assignment  maps  all  of  the  free  variables  in  the  formula,  the  interpretation  will  be  either  true 
or  false,  but  not  unknown. 
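The three interpretations above can be reproduced with a small three-valued evaluator. This is a sketch under stated assumptions, not the implementation used in this work: the tuple encoding, the sentinel `UNKNOWN`, and the function names `term` and `holds` are all hypothetical, and scalar values are assumed never to equal the sentinel string.

```python
UNKNOWN = "unknown"   # sentinel (assumption: no scalar value is this string)

def term(t, s, universe):
    """Interpret a term under a possibly partial assignment s (a dict)."""
    if isinstance(t, str):                      # variable
        return s.get(t, UNKNOWN)
    op, *args = t
    if op == "empty":
        return frozenset()
    if op == "un":
        return frozenset(universe)
    vals = [term(a, s, universe) for a in args]
    if UNKNOWN in vals:                         # any unknown sub-term gives unknown
        return UNKNOWN
    if op == "set":
        return frozenset({vals[0]})
    if op == "U":
        return vals[0] | vals[1]
    if op == "&":
        return vals[0] & vals[1]
    if op == "\\":
        return vals[0] - vals[1]
    if op == "<:":                              # domain restriction of a relation
        return frozenset((x, y) for (x, y) in vals[1] if x in vals[0])
    if op == "dom":
        return frozenset(x for (x, _) in vals[0])
    if op == "ran":
        return frozenset(y for (_, y) in vals[0])
    raise ValueError(op)

def holds(f, s, universe):
    """Return True, False, or UNKNOWN for formula f under assignment s."""
    op, *args = f
    if op == "not":
        v = holds(args[0], s, universe)
        return v if v == UNKNOWN else (not v)
    if op in ("and", "or"):
        a, b = (holds(x, s, universe) for x in args)
        decisive = (op == "or")                 # True decides "or", False decides "and"
        if a is decisive or b is decisive:
            return decisive
        return UNKNOWN if UNKNOWN in (a, b) else (not decisive)
    vals = [term(a, s, universe) for a in args]
    if UNKNOWN in vals:
        return UNKNOWN
    if op == "=":
        return vals[0] == vals[1]
    if op == "in":
        return vals[0] in vals[1]
    if op == "<=":
        return vals[0] <= vals[1]
    if op == "func":                            # no first element maps to two values
        return len({x for (x, _) in vals[0]}) == len(vals[0])
    raise ValueError(op)

dom_ok = ("and", ("=", ("dom", "usage"), "used"), ("=", ("dom", "usage'"), "used'"))
funcs = ("and", ("func", "usage"), ("func", "usage'"))
update = ("and", ("=", ("<:", "used", "usage'"), "usage"),
          ("=", "used'", ("U", "used", ("set", "a"))))
formula2 = ("and", ("and", dom_ok, funcs), ("and", update, ("in", "a", "used")))

universe = {"a0", "v0"}
edge = frozenset({("a0", "v0")})
partial = {"usage'": edge, "used'": frozenset()}
full = {"usage'": edge, "used'": frozenset({"a0"}),
        "usage": edge, "used": frozenset({"a0"}), "a": "a0"}
for s in ({}, partial, full):
    print(holds(formula2, s, universe))
```

The partial assignment already yields false because one conjunct is false, even though other conjuncts are still unknown; this is what allows a search to discard a partial assignment without extending it.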

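Theorem 1: $\forall s \in S, \tau \in \mathit{Term}.\ FV(\tau) \subseteq \mathrm{dom}\ s \Rightarrow \bar{s}(\tau) \neq \mathit{unknown}$

Proof: By structural induction.

If $\tau$ is $v$ where $v \in \mathit{Variable}$:

By definition, $FV(\tau) = \{v\}$.

As $FV(\tau) \subseteq \mathrm{dom}\ s$, $v \in \mathrm{dom}\ s$.

By definition of $\bar{s}$, $v \in \mathrm{dom}\ s \Rightarrow \bar{s}(\tau) \neq \mathit{unknown}$.

If $\tau$ is $\tau_1$ U $\tau_2$ where $\tau_1,\tau_2 \in \mathit{Term}_{set}$:

By definition of FV, $FV(\tau_1) \subseteq FV(\tau)$ and $FV(\tau_2) \subseteq FV(\tau)$.

Therefore, $FV(\tau_1) \subseteq \mathrm{dom}\ s$ and $FV(\tau_2) \subseteq \mathrm{dom}\ s$.
Therefore, by induction, $\bar{s}(\tau_1) \neq \mathit{unknown}$ and $\bar{s}(\tau_2) \neq \mathit{unknown}$.
By definition of $\bar{s}$, $\bar{s}(\tau) \neq \mathit{unknown}$.

Other productions follow similarly. ■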

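Theorem 2: $\forall s \in S, \phi \in \mathit{Wff}.\ FV(\phi) \subseteq \mathrm{dom}\ s \Rightarrow \models\phi[s] \neq \mathit{unknown}$

Proof: By structural induction.

If $\phi$ is $\tau_1$ in $\tau_2$ where $\tau_1 \in \mathit{Term}_{scalar}$ and $\tau_2 \in \mathit{Term}_{set}$:

By definition of FV, $FV(\tau_1) \subseteq FV(\phi)$ and $FV(\tau_2) \subseteq FV(\phi)$.

Therefore, $FV(\tau_1) \subseteq \mathrm{dom}\ s$ and $FV(\tau_2) \subseteq \mathrm{dom}\ s$.

Therefore, by Theorem 1, $\bar{s}(\tau_1) \neq \mathit{unknown}$ and $\bar{s}(\tau_2) \neq \mathit{unknown}$.

By definition of $\models$, $\models\phi[s] \neq \mathit{unknown}$.

If $\phi$ is ($\phi_1$ and $\phi_2$) where $\phi_1,\phi_2 \in \mathit{Wff}$:

By definition of FV, $FV(\phi_1) \subseteq FV(\phi)$ and $FV(\phi_2) \subseteq FV(\phi)$.

Therefore, $FV(\phi_1) \subseteq \mathrm{dom}\ s$ and $FV(\phi_2) \subseteq \mathrm{dom}\ s$.

Therefore, by induction, $\models\phi_1[s] \neq \mathit{unknown}$ and $\models\phi_2[s] \neq \mathit{unknown}$.

By definition of $\models$, $\models\phi[s] \neq \mathit{unknown}$.

Other productions follow similarly. ■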

Some formulae are logically implied by other formulae. The notation $\phi \models \phi'$ indicates that $\phi$
logically implies $\phi'$.

Definition 18: $\phi \models \phi'$ iff $\forall s \in S_N.\ \models\phi[s] = \mathit{true} \Rightarrow \models\phi'[s] = \mathit{true}$

Given the formula $\phi$ used in the preceding example, $\phi \models$ a in used.

2.5  Generators 

The key to any generate-and-test search is the ability to generate assignments. A special function,
called a generator, generates assignments for level $i$ of the search tree given an assignment from
level $i-1$. A generator for level $i$ adds a value for the variable $v_i$ to the initial assignment, without
changing the mapping of any other variable in that assignment.

Definition 19 (see footnote 4): A function $g_i : S_{i-1} \to \mathbb{P}\,S_i$ is a generator for level $i$ of the search iff
\[ \forall s \in S_{i-1}.\ \forall s' \in g_i(s).\ \mathit{Var}_{i-1} \lhd s' = s \]

A generator function expands a single assignment in the search tree into the complete set of
assignments descending immediately from that assignment.

An aggregate generator can also be defined for any generator. An aggregate generator generates a
complete level in the search tree, given the prior level of the search tree.

Definition 20: A function $G_i : \mathbb{P}\,S_{i-1} \to \mathbb{P}\,S_i$ is an aggregate generator for a level $i$ generator $g_i$ iff
\[ \forall Q_{i-1} \subseteq S_{i-1}.\ G_i(Q_{i-1}) = \bigcup_{q \in Q_{i-1}} g_i(q) \]

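A trivial generator, corresponding to an exhaustive-enumeration search, can be associated with
any variable using the exhaustive-enumeration generator $g^0$.

Definition 21: The exhaustive-enumeration generator $g^0$ for level $i$ is defined as
\[
g^0(s) =
\begin{cases}
\{\, s' \mid \exists x \in \mathit{Value}_{scalar}.\ s' = s \cup \{ v_i \mapsto x \} \,\} & \text{if } v_i \in \mathit{Var}_{scalar} \\
\{\, s' \mid \exists x \in \mathit{Value}_{set}.\ s' = s \cup \{ v_i \mapsto x \} \,\} & \text{if } v_i \in \mathit{Var}_{set} \\
\{\, s' \mid \exists x \in \mathit{Value}_{rel}.\ s' = s \cup \{ v_i \mapsto x \} \,\} & \text{if } v_i \in \mathit{Var}_{rel}
\end{cases}
\]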

A  search  requires  a  collection  of  generators,  one  associated  with  each  variable.  A  function 
mapping  each  variable  to  an  appropriate  generator  is  called  a  generator  suite. 

Definition 22: A function $\gamma : \mathit{Variable} \to (S \to \mathbb{P}\,S)$ is a generator suite iff
$\forall v \in \mathit{Variable}.\ \mathit{Ord}(v) = i \Rightarrow \gamma(v)$ is a generator for level $i$.
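As an illustration of Definitions 19 through 22, the following Python sketch builds exhaustive-enumeration generators and their aggregate generators for a two-variable example. Assignments are modeled as dictionaries and levels as lists; the helper names (`powerset`, `exhaustive_generator`, `aggregate`) and the tiny universe are assumptions of this sketch only.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def exhaustive_generator(var, values):
    """Level generator in the spirit of Definition 21: extend s with every value for var."""
    def g(s):
        # each result agrees with s on all previously mapped variables (Definition 19)
        return [{**s, var: x} for x in values]
    return g

def aggregate(g):
    """Aggregate generator (Definition 20): the union of g(q) over a whole level."""
    def G(level):
        return [s2 for q in level for s2 in g(q)]
    return G

universe = ["a0", "a1"]
# A generator suite for the ordering used (level 1), a (level 2).
suite = [exhaustive_generator("used", powerset(universe)),
         exhaustive_generator("a", universe)]

level = [{}]                 # level 0 holds only the empty assignment
for g in suite:
    level = aggregate(g)(level)
print(len(level))            # 4 subsets times 2 scalar values
```

With exhaustive generators at every level, the final level is the full set of assignments, which is the baseline that selective enumeration tries to shrink.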

2.6  Duplications 

The essence of selective enumeration is reducing the number of cases generated by removing
duplicates. A duplication partitions the set of full assignments into equivalence classes for some
particular formula $\phi$. The only requirement for these equivalence classes is that all assignments in
any equivalence class give the same interpretation to $\phi$.

Definition 23: A set of sets $d(\phi) : \mathbb{P}\,\mathbb{P}\,S_N$ is a duplication for the formula $\phi$ iff
$d(\phi)$ is a partitioning of $S_N$ and,
for the equivalence relation $\approx_{d(\phi)}$ induced by $d(\phi)$,
\[ \forall s,s' \in S_N.\ s \approx_{d(\phi)} s' \Rightarrow \models\phi[s] = \models\phi[s'] \]

Two  obvious  duplications  are  uninteresting  for  the  purposes  of  selective  enumeration.  The  first, 
which  I  call  do,  places  each  assignment  in  its  own  equivalence  class.  This  corresponds  to  the 
exhaustive-enumeration  search.  The  second  obvious  duphcation,  which  I  call  dj^,  divides  the 


assignments into two classes, ones that satisfy $\phi$ and ones that do not satisfy $\phi$. Although this
would be the ideal duplication, it is not directly computable and therefore of no great benefit.

4. I use the domain restriction operator $\lhd$ from Z here as well as in later definitions and theorems. For those
readers unfamiliar with Z, the $\lhd$ operator yields the relation that is the subset of the second operand
restricted to those pairs whose first element is a member of the first operand. More precisely,
$s \lhd r = \{\, (x,y) \mid (x,y) \in r \land x \in s \,\}$. In practice, this is used to select a subset of a relation that is
meaningful for a more restricted domain.

These  two  duplications  can  be  defined  in  terms  of  the  corresponding  equivalence  relations. 

\[ \forall s,s' \in S_N.\ s \approx_0 s' \Leftrightarrow s = s' \]

Vs, s'  £  Sn-  s  s'  l=(|)[s]  =  N(|)[s‘] 

The duplications described in the previous section for solving uniqueAddrAlloc can also be
defined in this manner. For the reduction involving a in used, every assignment for which a was
not an element of used was placed into a single equivalence class, with each other assignment
defining its own equivalence class.

\[ \forall s,s' \in S_N.\ s \approx_{\texttt{a in used}} s' \Leftrightarrow ((\models \texttt{a in used}[s] = \mathit{false} \land \models \texttt{a in used}[s'] = \mathit{false}) \lor s = s') \]

Other bounded generation duplications, such as the constraint on usage and usage', behave
similarly. Each duplication groups known false assignments together in a single equivalence class,
placing all other assignments into individual equivalence classes.

This form of equivalence relation can be generalized to support any partial assignment
duplication. Each partial assignment duplication has a related formula $\phi'$ that is implied by the
target formula itself. It is convenient to define a special notation to describe partial assignment
duplications.

Definition 24: An equivalence relation $\approx_{PAd(\phi,\phi')}$ is a partial assignment duplicate equivalence
relation iff
\[ \phi \models \phi' \land \forall s,s' \in S_N.\ s \approx_{PAd(\phi,\phi')} s' \Leftrightarrow (s = s' \lor (\models\phi'[s] = \mathit{false} \land \models\phi'[s'] = \mathit{false})) \]

This places all assignments that fail to satisfy $\phi'$ into a single equivalence class and each
assignment that satisfies $\phi'$ into its own equivalence class.

Theorem 3: The partitioning of $S_N$ induced by a partial assignment duplicate equivalence relation
$\approx_{PAd(\phi,\phi')}$ is a duplication for $\phi$.

Proof: To prove that $\approx_{PAd(\phi,\phi')}$ induces a duplication, it is necessary to prove that

\[ \forall \phi : \mathit{Wff}.\ \forall s,s' \in S_N.\ s \approx_{PAd(\phi,\phi')} s' \Rightarrow \models\phi[s] = \models\phi[s'] \]

By the definition of $\approx_{PAd(\phi,\phi')}$, there are two cases to consider:

$s = s'$ and $(\models\phi'[s] = \mathit{false} \land \models\phi'[s'] = \mathit{false})$

For the first case, clearly $\models\phi[s] = \models\phi[s']$.

For the second case, if $s \in S_N$, $\models\phi[s]$ is either true or false by Theorem 2.

Because $\phi \models \phi'$, $\models\phi[s] = \mathit{true} \Rightarrow \models\phi'[s] = \mathit{true}$.

Therefore, if $\models\phi'[s] = \mathit{false}$, $\models\phi[s] = \mathit{false}$.

Therefore, for the second case, $\models\phi[s] = \mathit{false} \land \models\phi[s'] = \mathit{false}$. ■

Definition 25: $PAd(\phi,\phi')$ is the partial assignment duplication induced by the partial assignment
equivalence relation $\approx_{PAd(\phi,\phi')}$.

I will not describe the permutation duplications until I formalize the notion of permutations in
Section 4.
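The partition induced by a partial assignment duplicate equivalence relation can be computed directly for a small universe. The sketch below is illustrative only: `pad_classes` is a hypothetical helper, the implied formula is passed as a Python predicate, and the universe and variables are chosen to mirror the a in used example.

```python
from itertools import combinations, product

def pad_classes(full_assignments, phi_prime):
    """Partition as in Definition 24: every assignment falsifying phi_prime
    shares one class; every assignment satisfying it is a singleton class."""
    classes = [[s] for s in full_assignments if phi_prime(s)]
    failing = [s for s in full_assignments if not phi_prime(s)]
    if failing:
        classes.append(failing)
    return classes

universe = ["a0", "a1"]
subsets = [frozenset(c) for r in range(len(universe) + 1)
           for c in combinations(universe, r)]
# All full assignments over the variables used (a set) and a (a scalar).
full = [{"used": u, "a": a} for u, a in product(subsets, universe)]

classes = pad_classes(full, lambda s: s["a"] in s["used"])
print(len(full), len(classes))
```

Here eight full assignments collapse into five classes, so a search that visits one representative per class tests five cases instead of eight; the saving grows quickly with the size of the universe.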

Duplications  may  also  be  combined,  further  reducing  the  number  of  assignments  that  selective 
enumeration  must  generate  and  test.  To  combine  two  duplications,  the  corresponding  equivalence 
relations  are  combined. 

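Definition 26: $\approx_a\ =\ \approx_b \circ \approx_c$ iff

\[ \forall s,s' \in S_N.\ s \approx_a s' \Leftrightarrow (\, s \approx_b s' \lor s \approx_c s' \lor (\, \exists s'' \in S_N.\ (s \approx_a s'' \land s'' \approx_a s') \,) \,) \]

The result of combining two equivalence relations is itself an equivalence relation that induces a
duplication.

Lemma 4: For any two equivalence relations $\approx_b$ and $\approx_c$, $\approx_b \circ \approx_c$ is an equivalence relation.

Proof: A relation must be reflexive, symmetric and transitive to be an equivalence relation.

As $\approx_b$ is reflexive, $\forall s \in S_N.\ s \approx_b s$.

Therefore, $\forall s \in S_N.\ s \approx_{b \circ c} s$.

If $s \approx_{b \circ c} s'$, then either $s \approx_b s'$ or $s \approx_c s'$.

Assuming $s \approx_b s'$, then by symmetry of $\approx_b$, $s' \approx_b s$.

Therefore $s' \approx_{b \circ c} s$ and $\approx_{b \circ c}$ is symmetric.

By the definition of $\circ$,

\[ \forall s,s',s'' \in S_N.\ s \approx_{b \circ c} s' \land s' \approx_{b \circ c} s'' \Rightarrow s \approx_{b \circ c} s'' \]
■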

Definition 27: $d_b(\phi) \circ d_c(\phi)$ is the partitioning induced by $\approx_b \circ \approx_c$.

Theorem 5: $d_b(\phi) \circ d_c(\phi)$ is a duplication.

Proof: Let $d_a(\phi) = d_b(\phi) \circ d_c(\phi)$ and $\approx_a\ =\ \approx_b \circ \approx_c$.

There are two requirements for $d_a(\phi)$ to be a duplication:

it must be a partitioning of $S_N$ and

$\forall \phi \in \mathit{Wff}, \forall s,s' \in S_N.\ s \approx_a s' \Rightarrow \models\phi[s] = \models\phi[s']$.

By Lemma 4, $\approx_a$ is an equivalence relation on $S_N$,
therefore $d_a(\phi)$ is a partitioning of $S_N$.

Assume $s,s' \in S_N$ such that $s \approx_a s'$.

By definition, one of

$s \approx_b s'$

$s \approx_c s'$

$\exists s'' \in S_N.\ (s \approx_a s'' \land s'' \approx_a s')$

must hold.

If either of the first two possibilities holds,
then $\models\phi[s] = \models\phi[s']$ by definition of $d_b(\phi)$ or $d_c(\phi)$, respectively.

The third possibility requires the existence of a sequence of full assignments,

\[ s^1, \ldots, s^k \in S_N.\ s \approx_1 s^1 \land s^1 \approx_2 s^2 \land \ldots \land s^k \approx_{k+1} s' \]

where $\approx_1, \ldots, \approx_{k+1}$ are either $\approx_b$ or $\approx_c$.

It is obvious by induction that such a sequence must guarantee that $\models\phi[s] = \models\phi[s']$. ■

Combining two partial assignment duplications in this manner gives the same result as the partial
assignment duplication defined by the formula built from the conjunction of the two formulae
defining the original duplications.

Theorem 6: $PAd(\phi,\phi_1') \circ PAd(\phi,\phi_2') = PAd(\phi,(\phi_1' \text{ and } \phi_2'))$.

Proof: There are two distinct requirements that must be demonstrated equivalent for this
theorem to hold.

The first is straightforward:

\[ \phi \models \phi_1' \land \phi \models \phi_2' \Leftrightarrow \phi \models (\phi_1' \text{ and } \phi_2') \]

Let $\approx_1$ be the equivalence relation induced by $PAd(\phi,\phi_1')$,
$\approx_2$ be the equivalence relation induced by $PAd(\phi,\phi_2')$ and
$\approx_{and}$ be the equivalence relation induced by $PAd(\phi,(\phi_1' \text{ and } \phi_2'))$.

The second requirement is that

\[ ((s \approx_1 s' \lor s \approx_2 s') \Rightarrow s \approx_{and} s') \land ((\, \exists s'' \in S_N.\ (s \approx_{and} s'' \land s'' \approx_{and} s') \,) \Rightarrow s \approx_{and} s') \]

The second clause of this requirement is a direct consequence of the fact that $\approx_{and}$
is an equivalence relation.

For the first clause, if $s = s'$, then $s \approx_1 s'$, $s \approx_2 s'$, and $s \approx_{and} s'$.
Otherwise, if $s \approx_1 s'$, then $\models\phi_1'[s] = \mathit{false} \land \models\phi_1'[s'] = \mathit{false}$.

Therefore, $\models(\phi_1' \text{ and } \phi_2')[s] = \mathit{false} \land \models(\phi_1' \text{ and } \phi_2')[s'] = \mathit{false}$.
Therefore, $s \approx_{and} s'$.

Similarly, if $s \approx_2 s'$, then $s \approx_{and} s'$. ■


2.7  Soundness 

It  is  now  possible  to  define  a  selective  enumeration  search  precisely. 

Definition 28: A function $\omega : \mathit{Wff} \times \mathit{GeneratorSuite} \to \mathbb{P}\,S_N$ is a search iff
\[ \omega(\phi,\gamma) = \{\, s \mid \models\phi[s] = \mathit{true} \land s \in G_N(G_{N-1}(G_{N-2}(\ldots G_2(G_1(\{\emptyset\}))))) \,\} \]
where $G_i$ is the aggregate generator for each $g_i$ in $\gamma$.

To be sound, a search must guarantee that it will find at least one satisfying assignment if any
exist.

Definition 29: A search $\omega(\phi,\gamma)$ is sound iff $(\exists s \in S_N.\ \models\phi[s] = \mathit{true}) \Rightarrow \omega(\phi,\gamma) \neq \emptyset$.

For any duplication $d(\phi)$ for a formula $\phi$, selective enumeration will be sound if it enumerates at
least one assignment from each satisfying set in $d(\phi)$. A generator is sound if the only satisfying
assignments excluded are duplicates of other assignments generated.

Definition 30: A level $i$ generator $g_i$ is sound for a duplication $d(\phi)$ iff

\[ \forall s \in S_N.\ \models\phi[s] = \mathit{true} \Rightarrow \exists s' \in S_N.\ \mathit{Var}_i \lhd s' \in g_i(\mathit{Var}_{i-1} \lhd s) \land s \approx_{d(\phi)} s' \]

A subset of $S_N$ is said to represent $d(\phi)$ if it includes at least one assignment from each satisfying
set in $d(\phi)$. This can be easily extended to include subsets of $S_i$ that can be used to generate a
representative set.

Definition 31: A set $Q_i \subseteq S_i$ represents $d(\phi)$ iff

\[ \forall s \in S_N.\ (\, \models\phi[s] = \mathit{true} \Rightarrow \exists s' \in S_N.\ (s \approx_{d(\phi)} s' \land \mathit{Var}_i \lhd s' \in Q_i) \,) \]

As  a  base  case,  the  set  containing  only  the  empty  assignment  represents  any  duplication. 

Lemma 7: $\{\emptyset\} \subseteq S_0$ represents any $d(\phi)$.

Proof: Proof by construction.

By definition, $\mathit{Var}_0 = \emptyset$.

Therefore, $\forall s \in S_N.\ \mathit{Var}_0 \lhd s = \emptyset$. ■

Starting with a representative set, a sound generator will generate a representative set.

Theorem 8: If a level $i$ generator $g_i$ with the aggregate generator $G_i$ is sound for $d(\phi)$ then
$\forall Q_{i-1} \subseteq S_{i-1}.\ Q_{i-1}$ represents $d(\phi) \Rightarrow G_i(Q_{i-1})$ represents $d(\phi)$.

Proof: By contradiction.

Assume $s \in S_N.\ \models\phi[s] = \mathit{true} \land \forall s' \in S_N.\ (s \approx_{d(\phi)} s' \Rightarrow \mathit{Var}_i \lhd s' \notin G_i(Q_{i-1}))$.

By definition of represents,

$\exists s'' \in S_N.\ (s \approx_{d(\phi)} s'' \land \mathit{Var}_{i-1} \lhd s'' \in Q_{i-1})$.

Because $\models\phi[s] = \mathit{true}$ and $g_i$ is sound for $d(\phi)$,

$\exists s''' \in S_N.\ (s'' \approx_{d(\phi)} s''' \land \mathit{Var}_i \lhd s''' \in G_i(Q_{i-1}))$.

By transitivity, $s \approx_{d(\phi)} s'''$.

Therefore, the assumption is contradicted. ■

A  combination  of  duplications  defines  a  new  collection  of  equivalence  classes  that  completely 
includes  all  of  the  original  equivalence  classes.  Therefore,  a  generator  that  is  sound  for  one 
duplication  is  also  sound  for  that  duplication  in  combination  with  any  other  duplication. 

Theorem 9: If a level $i$ generator $g_i$ is sound for any duplication $d_1(\phi)$, then
for any duplication $d_2(\phi)$, $g_i$ is sound for $d_1(\phi) \circ d_2(\phi)$.

Proof: Let $s \in S_N.\ \models\phi[s] = \mathit{true}$,

$\approx_1$ be the equivalence relation induced by $d_1(\phi)$,
$\approx_{1 \circ 2}$ be the equivalence relation induced by $d_1(\phi) \circ d_2(\phi)$.

By the definition of a sound generator,

$\exists s' \in S_N.\ \mathit{Var}_i \lhd s' \in g_i(\mathit{Var}_{i-1} \lhd s) \land s \approx_1 s'$.

By definition of $\circ$, $s \approx_1 s' \Rightarrow s \approx_{1 \circ 2} s'$.

Therefore, $g_i$ is sound for $d_1(\phi) \circ d_2(\phi)$. ■

For  a  sound  search,  the  goal  is  a  collection  of  generators  that  can  generate  a  representative  set  of 
full  assignments. 

Lemma 10: The result of a search $\omega(\phi,\gamma)$ represents $d(\phi)$ if $\forall g_i \in \mathrm{ran}\ \gamma.\ g_i$ is sound for $d(\phi)$.

Proof: By induction on $i$.

Hypothesis: $G_i(G_{i-1}(\ldots(G_1(\{\emptyset\}))))$ represents $d(\phi)$ if $\forall g_i \in \mathrm{ran}\ \gamma.\ g_i$ is sound for
$d(\phi)$.

By Lemma 7, $\{\emptyset\}$ represents $d(\phi)$.

Therefore, because $g_1$ is sound for $d(\phi)$, $G_1(\{\emptyset\})$ represents $d(\phi)$ by Theorem 8.

By induction, $G_{i-1}(G_{i-2}(\ldots(G_1(\{\emptyset\}))))$ represents $d(\phi)$.

Because $g_i$ is sound for $d(\phi)$,

$G_i(G_{i-1}(\ldots(G_1(\{\emptyset\}))))$ represents $d(\phi)$ by Theorem 8. ■

Theorem 11: A search $\omega(\phi,\gamma)$ is sound if $\exists d(\phi).\ \forall g_i \in \mathrm{ran}\ \gamma.\ g_i$ is sound for $d(\phi)$.

Proof: Assume an assignment $s$ exists such that $\models\phi[s] = \mathit{true}$.

By Lemma 10, $\omega(\phi,\gamma)$ must represent $d(\phi)$.

By definition of represents,

$\exists q \in \omega(\phi,\gamma).\ \exists s' \in S_N.\ q = \mathit{Var}_N \lhd s' \land s \approx_{d(\phi)} s'$.

Because $\omega(\phi,\gamma) \subseteq S_N$ and $\mathit{Var}_N \lhd s' = s'$, $q = s'$.

This implies that $s' \in \omega(\phi,\gamma)$. ■
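The soundness argument can be checked empirically on a toy instance. The sketch below compares an exhaustive suite with a suite whose level-2 generator only draws a from the current value of used. Everything here is an assumption made for illustration: the `search` helper, the generators, and the target predicate `phi` (which implies a in used) are hypothetical and are not the formulae or generators of this paper.

```python
from itertools import combinations

def search(phi, suite):
    """Run the generators level by level, then test every full assignment.
    Returns (number of assignments tested, satisfying assignments)."""
    level = [{}]
    for g in suite:
        level = [s2 for s in level for s2 in g(s)]
    return len(level), [s for s in level if phi(s)]

universe = ["a0", "a1", "a2"]
subsets = [frozenset(c) for r in range(len(universe) + 1)
           for c in combinations(universe, r)]

def gen_used(s):
    return [{**s, "used": u} for u in subsets]

def gen_a_all(s):                 # exhaustive generator for a
    return [{**s, "a": x} for x in universe]

def gen_a_in_used(s):             # omits only assignments that falsify `a in used`
    return [{**s, "a": x} for x in sorted(s["used"])]

def phi(s):                       # hypothetical target formula; it implies `a in used`
    return s["a"] in s["used"] and len(s["used"]) == 2

tested_all, found_all = search(phi, [gen_used, gen_a_all])
tested_sel, found_sel = search(phi, [gen_used, gen_a_in_used])
print(tested_all, tested_sel, len(found_all), len(found_sel))
```

The selective suite tests half as many assignments yet finds the same satisfying ones, because every assignment it skips falsifies a formula implied by the target and so belongs to the single all-false class of the corresponding partial assignment duplication.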
3. Bounded Generation

In  the  first  section,  I  introduced  bounded  generation  by  means  of  a  few  examples.  In  this  section,  I 
clarify  what  is  meant  by  bounded  generation  and  prove  that  any  bounded-generation  generator  is 
sound. 

3.1  Overview  of  Bounded  Generation 

As introduced in the first section, bounded generation limits the values to be generated for any
variable by limiting the underlying universe of elements. Using the claim uniqueAddrAlloc from
Figure 1, I have demonstrated two concrete examples: constraining the value of a to be an element
of the value of used and constraining the domain and range of the value of usage to be subsets of
the domain and range of the value of usage'.

The  first  example  demonstrates  a  direct  constraint  on  the  universe  of  elements  from  which  a  scalar 
value  is  chosen.  In  the  second  example,  a  relation  is  restricted  to  a  (generally)  smaller  universe  of 
elements  for  the  domain  and  the  range.  In  each  example,  bounded  generation  uses  information 
derived  from  the  formula  to  reduce  the  possible  values  to  be  generated. 

This  reduction  is  the  essence  of  bounded  generation.  Unlike  most  other  techniques,  bounded 
generation  does  not  provide  any  self-contained  generators.  Instead,  bounded  generation  depends 
on  other  generators,  possibly  even  using  generators  implementing  other  reduction  techniques. 

Another  example  of  bounded  generation  begins  with  the  atomic  formula  used'  =  (used  U  {a}) 
found  in  Formula  2.  Any  assignment  satisfying  this  formula  also  satisfies  the  formula  used  <= 
used'.  This  new  formula  logically  implies  used  &  (  Un\used'  )  =  {}. 

For bounded generation of set variables, such as used, the goal is to divide the universe of
elements $\mathcal{U}$ into three distinct subsets: elements that are required to be included in the set denoted
by the variable, elements that are never in that set, and elements that may possibly (but not
necessarily) be in that set. For this example, any element contained in the set given by
$\bar{s}_{set}$(Un \ used') can never be in the set denoted by used for any satisfying assignment. With a
different value for Ord (generating values for a before generating values for used), it could also
be determined that the element denoted by a is in the set denoted by used for any satisfying
assignment. This relationship is illustrated in Figure 4.

The set of elements that may possibly be included is described by a term, which I call $\mathcal{P}$. For this
example, $\mathcal{P}$ is (used' \ a). The set of elements that are required for any satisfying assignment is
described by a similar term, which I call $\mathcal{R}$. For this example, $\mathcal{R}$ is a. The underlying generator
needs to consider only the elements contained in $\bar{s}_{set}(\mathcal{P})$. Bounded generation unions each value
yielded by the underlying generator with the value of $\bar{s}_{set}(\mathcal{R})$. If the underlying generator
generates each subset of $\bar{s}_{set}(\mathcal{P})$, bounded generation will be sound. In fact, as will be shown in
the following sections, bounded generation is sound when combined with some underlying
generators that do not yield all possible subsets of $\bar{s}_{set}(\mathcal{P})$.

For scalar variables, there is no subset of elements always included, so the value of $\mathcal{R}$ is always {}.
A naive underlying generator could yield each element of $\bar{s}_{set}(\mathcal{P})$, which the bounded-generation
generator would yield unchanged.
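For a set variable, the three-way split can be sketched as follows. The function `bounded_set_values` and the concrete values are assumptions for illustration; the sketch presumes that a and used' have already been assigned, so that the required and possible element sets are known.

```python
from itertools import combinations

def bounded_set_values(required, possible):
    """Each candidate is the required elements unioned with one subset of
    the possible elements (the naive underlying generator yields all subsets)."""
    possible = sorted(possible)
    return [frozenset(required) | frozenset(c)
            for r in range(len(possible) + 1)
            for c in combinations(possible, r)]

universe = {"a0", "a1", "a2", "a3"}
used_prime = {"a0", "a1", "a2"}     # assumed already-generated value of used'
a = "a0"                            # assumed already-generated value of a

required = {a}                      # elements that must be in used
possible = used_prime - {a}         # elements that may or may not be in used
candidates = bounded_set_values(required, possible)
print(len(candidates), 2 ** len(universe))
```

Only four candidate values for used are produced, compared with the sixteen subsets of the universe that an exhaustive generator would enumerate; the element a3, which lies outside used', is never considered.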

Relational variables are the most complicated for bounded generation. The most consistent and
efficient definition of bounded generation for relations would consider sets of edges, rather than
the sets of elements used for scalar and set generation. This definition, however, significantly
complicates both the definition and implementation of bounded generation. Instead, I use sets of
elements for bounded generation of relations as well. One set describes the elements that may or
may not be in the domain of the relation for a satisfying assignment. I call the term describing this
set $\mathcal{P}$, as with scalar or set generation. There is a similar term, which I call $\mathcal{Q}$, that describes the set
of elements that may or may not be part of the range. As with scalar generation, $\mathcal{R}$ is
uninteresting, so $\mathcal{R}$ is always {} (see footnote 5).

Figure 4: Partitioning $\mathcal{U}$ into three sets for bounded generation of used. The first (leftmost) partition includes all
elements that are required to be included in the value of used for any satisfying assignment. In this example,
this is simply the value denoted by the variable a. The term $\mathcal{R}$ is used to describe these values. The second
partition includes all elements that may possibly be included in the value of used. The term $\mathcal{P}$ is used to describe
these values. The third, and final, partition includes all elements that must not be included in the value of used. In
this example, this is the value $\bar{s}_{set}$(Un \ used'). Only the elements in the middle partition need to be considered
by the underlying generator.

3.2  Tj;)  Limitation 

Bounded  generation  depends  on  generating  values  from  a  limited  universe  of  elements.  In  this 
section,  I  develop  a  notation  for  describing  the  underlying  limitations. 

These  limitations  are  based  on  two  elements  from  Terniset,  referred  to  as  T  mdj).  The  currently 
generated  partial  assignment  S  provides  the  context  to  evaluate  these  terms,  using  the  Sje, 
function.  For  scalar  values  and  set  values,  only  (P  is  interesting,  with  Sset(fO  yielding  the  base  set 
from  which  elements  may  be  drawn.  For  relational  values,  Sset(^P)  limits  the  domain  and  Sset^) 
limits  the  range. 

Definition 32: For any 𝒫,𝒥 ∈ Term_set, value x is 𝒫𝒥-limited for assignment s iff
FV(𝒫) ⊆ dom s ∧ FV(𝒥) ⊆ dom s ∧
(x ∈ Value_scalar ⇒ x ∈ s̄(𝒫)) ∧
(x ∈ Value_set ⇒ x ⊆ s̄(𝒫)) ∧
(x ∈ Value_rel ⇒ x ⊆ { (y,z) | y ∈ s̄(𝒫) ∧ z ∈ s̄(𝒥) })

5. As will be shown in Section 6, where I refine the model to distinguish functions from general relations, a
term describing a set of edges equivalent to ℛ is useful for the generation of functions.

The limit functions generate 𝒫𝒥-limited values and assignments for most values and assignments.
These functions, however, cannot 𝒫𝒥-limit a scalar value that is not already 𝒫𝒥-limited, or an
assignment that maps a variable to a scalar value that is not already 𝒫𝒥-limited.

Definition 33: For the function limit^𝒫𝒥 : Value × S → Value ∪ {unknown},
limit^𝒫𝒥(x,s) is

if ¬(FV(𝒫) ⊆ dom s ∧ FV(𝒥) ⊆ dom s)
    unknown
if x ∈ Value_scalar
    x if x ∈ s̄(𝒫)
    unknown otherwise
if x ∈ Value_set
    { y | y ∈ s̄(𝒫) ∧ y ∈ x }
if x ∈ Value_rel
    { (y,z) | y ∈ s̄(𝒫) ∧ z ∈ s̄(𝒥) ∧ (y,z) ∈ x }

Definition 34: The function limit_i^𝒫𝒥 : S → S is defined as
limit_i^𝒫𝒥(s) =
    { v ↦ x | v ∈ dom s ∧ Ord(v) ≠ i ∧ x = s(v) } ∪
    { v ↦ x | v ∈ dom s ∧ Ord(v) = i ∧ x = limit^𝒫𝒥(s(v),s) }

For scalar variables, the function limit_i^𝒫𝒥 leaves v_i unbound in the resultant assignment if the value
s(v_i) is not already 𝒫𝒥-limited.
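
As an informal illustration of Definitions 33 and 34, the limit functions can be sketched in Python. This is only a sketch under stated assumptions: the arguments p and j stand for the already-evaluated element sets of the terms 𝒫 and 𝒥, and the names UNKNOWN, kind, limit, and limit_level are hypothetical helpers introduced here, not part of the formal model.

```python
# Illustrative sketch only. p and j are assumed to be the already-evaluated
# element sets of the terms P and J; "kind" tags the sort of the value.
UNKNOWN = object()  # plays the role of "unknown" in Definition 33


def limit(kind, x, p, j=frozenset()):
    """Restrict value x to the elements allowed by p (and j for relations)."""
    if kind == "scalar":
        # A scalar cannot be repaired: it is already limited, or unknown.
        return x if x in p else UNKNOWN
    if kind == "set":
        return frozenset(y for y in x if y in p)
    if kind == "rel":
        return frozenset((y, z) for (y, z) in x if y in p and z in j)
    raise ValueError(f"unexpected kind: {kind}")


def limit_level(s, var, kind, p, j=frozenset()):
    """Limit only the binding of var; every other binding is copied as-is."""
    out = {v: x for v, x in s.items() if v != var}
    if var in s:
        limited = limit(kind, s[var], p, j)
        if limited is not UNKNOWN:  # an unlimitable scalar stays unbound
            out[var] = limited
    return out
```

For example, limiting the set value {a0, a2} against p = {a0, a1} keeps only a0, while limiting the scalar a2 against the same p yields UNKNOWN and the variable is left unbound.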

A generator can be 𝒫𝒥-limiting.

Definition 35: For any 𝒫,𝒥 ∈ Term_set such that FV(𝒫) ⊆ Var_{i-1} and FV(𝒥) ⊆ Var_{i-1}, a level i
generator g_i^𝒫𝒥 is a 𝒫𝒥-limiting generator iff ∀s ∈ S_{i-1}. ∀s' ∈ g_i^𝒫𝒥(s). s'(v_i) is 𝒫𝒥-limited for s.

An exhaustive-enumeration 𝒫𝒥-limiting generator can be defined.

Definition 36: For some 𝒫,𝒥 ∈ Term_set such that FV(𝒫) ⊆ Var_{i-1} and FV(𝒥) ⊆ Var_{i-1},
the level i generator g0_i^𝒫𝒥 : S_{i-1} → ℙS_i is defined as

if v_i ∈ Var_scalar
    g0_i^𝒫𝒥(s) = { s' | ∃x ∈ s̄(𝒫). s' = s ∪ { v_i ↦ x } }
if v_i ∈ Var_set
    g0_i^𝒫𝒥(s) = { s' | ∃x ∈ ℙ s̄(𝒫). s' = s ∪ { v_i ↦ x } }
if v_i ∈ Var_rel
    g0_i^𝒫𝒥(s) = { s' | ∃x ∈ ℙ(s̄(𝒫) × s̄(𝒥)). s' = s ∪ { v_i ↦ x } }

Theorem 12: The level i generator g0_i^𝒫𝒥 given by Definition 36 is a 𝒫𝒥-limiting generator.

Proof: Let 𝒫,𝒥 ∈ Term_set such that FV(𝒫) ⊆ Var_{i-1} and FV(𝒥) ⊆ Var_{i-1}, and s ∈ S_{i-1}.

If v_i ∈ Var_scalar
    ∀s' ∈ g0_i^𝒫𝒥(s). s'(v_i) ∈ s̄(𝒫)
    Therefore, s'(v_i) is 𝒫𝒥-limited for s.

The other cases follow similarly. ■
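
The exhaustive-enumeration generator of Definition 36 can be sketched as follows. This is a minimal illustration, assuming that p and j are the already-evaluated element sets of 𝒫 and 𝒥 and that assignments are plain dictionaries; the names powerset, g0, and kind are hypothetical.

```python
from itertools import combinations, product


def powerset(elems):
    """All subsets of elems, as frozensets."""
    elems = sorted(elems)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]


def g0(s, var, kind, p, j=frozenset()):
    """Extend partial assignment s with every limited value for var."""
    if kind == "scalar":
        values = sorted(p)                 # one element of p
    elif kind == "set":
        values = powerset(p)               # any subset of p
    elif kind == "rel":
        values = powerset(product(p, j))   # any subset of p x j
    else:
        raise ValueError(f"unexpected kind: {kind}")
    return [{**s, var: x} for x in values]
```

Every value produced is drawn from p (or p × j), which is the observation behind Theorem 12.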

Just as a set of assignments can represent a duplication, a set of assignments can be a 𝒫𝒥-limited
representative of a duplication.

Definition 37: For any 𝒫,𝒥 ∈ Term_set such that FV(𝒫) ⊆ Var_{i-1} and FV(𝒥) ⊆ Var_{i-1}, a set L_i ⊆ S_i is a
𝒫𝒥-limited representative of d(φ) iff

∀s ∈ S_N. ⊨φ[s] = true ⇒ ∃s' ∈ S_N. s ≈_d(φ) s' ∧ limit_i^𝒫𝒥(Var_i ◁ s') ∈ L_i.

This is almost the same definition as was used for represents originally. Instead of matching a
partial assignment from each equivalence class in d(φ), it is now necessary to match a 𝒫𝒥-limited
partial assignment from each equivalence class.

Definition 38: A 𝒫𝒥-limited level i generator g_i^𝒫𝒥 is 𝒫𝒥-limited sound for d(φ) iff

∀s ∈ S_N. ⊨φ[s] = true ⇒ ∃s' ∈ S_N. s ≈_d(φ) s' ∧ limit_i^𝒫𝒥(Var_i ◁ s') ∈ g_i^𝒫𝒥(Var_{i-1} ◁ s).

As before, a generator is sound if it yields a value equivalent to each possible value, except it now
yields the 𝒫𝒥-limited equivalent value. The aggregate generator for a 𝒫𝒥-limited sound generator
yields a 𝒫𝒥-limited representative set when given a 𝒫𝒥-limited representative set.

Theorem 13: If a 𝒫𝒥-limited level i generator g_i^𝒫𝒥 with the aggregate generator G_i^𝒫𝒥 is 𝒫𝒥-limited
sound for d(φ) then

∀Q_{i-1} ⊆ S_{i-1}. Q_{i-1} represents d(φ) ⇒ G_i^𝒫𝒥(Q_{i-1}) 𝒫𝒥-limited represents d(φ).

Proof: By construction.

Assume s ∈ S_N. ⊨φ[s] = true.

Because Q_{i-1} represents d(φ),

    ∃s' ∈ S_N. s ≈_d(φ) s' ∧ Var_{i-1} ◁ s' ∈ Q_{i-1}.

Therefore, by definition of 𝒫𝒥-limited sound,

    ∃s'' ∈ S_N. limit_i^𝒫𝒥(Var_i ◁ s'') ∈ g_i^𝒫𝒥(Var_{i-1} ◁ s') ∧ s' ≈_d(φ) s''.

By transitivity, s ≈_d(φ) s''.

Therefore, because limit_i^𝒫𝒥(Var_i ◁ s'') ∈ G_i^𝒫𝒥(Q_{i-1}),

G_i^𝒫𝒥(Q_{i-1}) is a 𝒫𝒥-limited representative of d(φ). ■

The 𝒫𝒥-limited exhaustive-enumeration generator defined earlier is 𝒫𝒥-limited sound.

Theorem 14: The exhaustive-enumeration 𝒫𝒥-limiting generator g0_i^𝒫𝒥 defined by Definition 36 is a
𝒫𝒥-limited sound level i generator for any duplication d(φ) and formula φ such that
v_i ∈ Var_scalar ⇒ φ ⊨ v_i in 𝒫.

Proof: There are three cases to consider based on the kind of variable.

If v_i ∈ Var_scalar

    By definition, ⊨φ[s] = true ⇒ s(v_i) ∈ s̄(𝒫).

    By definition of g0_i^𝒫𝒥,

        ∀s ∈ S_N. ∀x ∈ s̄(𝒫). ∃s' ∈ g0_i^𝒫𝒥(Var_{i-1} ◁ s). s'(v_i) = x

    Therefore, ∀s ∈ S_N. ⊨φ[s] = true ⇒ Var_i ◁ s ∈ g0_i^𝒫𝒥(Var_{i-1} ◁ s)

    Therefore, limit_i^𝒫𝒥(Var_i ◁ s) ∈ g0_i^𝒫𝒥(Var_{i-1} ◁ s)

    Because s ≈_d(φ) s by definition, g0_i^𝒫𝒥 is 𝒫𝒥-limited sound.

If v_i ∈ Var_set

    By definition of limit^𝒫𝒥, ∀s ∈ S_N. ∀x ∈ Value_set. limit^𝒫𝒥(x,s) ⊆ s̄(𝒫).

    Therefore, ∀s ∈ S_N. limit^𝒫𝒥(s(v_i),s) ⊆ s̄(𝒫).

    By definition of g0_i^𝒫𝒥, ∀s ∈ S_N. ∀x ⊆ s̄(𝒫). ∃s' ∈ g0_i^𝒫𝒥(s). s'(v_i) = x

    Therefore, ∀s ∈ S_N. limit_i^𝒫𝒥(Var_i ◁ s) ∈ g0_i^𝒫𝒥(s).

A similar argument follows when v_i ∈ Var_rel.


3.3  Definition  of  Bounded  Generation 

I now define bounded generation in terms of 𝒫𝒥-limiting.


Definition 39: For any 𝒫,𝒥,ℛ ∈ Term_set, a level i generator bg_i^𝒫𝒥ℛ is a bounded generator for φ
using a 𝒫𝒥-limiting generator g_i^𝒫𝒥 iff
FV(𝒫) ⊆ Var_{i-1} ∧ FV(𝒥) ⊆ Var_{i-1} ∧ FV(ℛ) ⊆ Var_{i-1} ∧
(v_i ∈ Var_scalar ⇒
    (φ ⊨ v_i in 𝒫 ∧ 𝒥 = {} ∧ ℛ = {} ∧
     bg_i^𝒫𝒥ℛ(s) = g_i^𝒫𝒥(s))) ∧
(v_i ∈ Var_set ⇒
    (φ ⊨ v_i <= (𝒫 U ℛ) ∧ φ ⊨ ℛ <= v_i ∧ 𝒥 = {} ∧
     bg_i^𝒫𝒥ℛ(s) = { s' | ∃s'' ∈ g_i^𝒫𝒥(s). s' = s'' ∪ s̄(ℛ) })) ∧
(v_i ∈ Var_rel ⇒
    (φ ⊨ dom v_i <= 𝒫 ∧ φ ⊨ ran v_i <= 𝒥 ∧ ℛ = {} ∧
     bg_i^𝒫𝒥ℛ(s) = g_i^𝒫𝒥(s)))

This definition exactly corresponds to the overview of bounded generation given earlier. All the
terms used must be well defined for any partial assignment considered as an input. For scalar
variables, the generator utilizes an underlying generator that considers only the possible values, as
given by the term 𝒫. For set variables, the generator unions the required elements, as indicated by
the ℛ term, with each value yielded by the underlying generator considering only the non-required
possible elements, as given by the 𝒫 term. For relational variables, the generator yields
values given by the underlying generator considering a reduced set of possible elements for the
domain (indicated by the term 𝒫) and for the range (indicated by the term 𝒥).
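
The set-variable case of bounded generation can be illustrated with a short Python sketch. It assumes that p and r are the already-evaluated element sets of 𝒫 (possible, non-required elements) and ℛ (required elements); the function names are hypothetical and the underlying generator is taken to be plain exhaustive enumeration.

```python
from itertools import combinations


def subsets(elems):
    """All subsets of elems, as frozensets (the underlying enumeration)."""
    elems = sorted(elems)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]


def bounded_set_generator(s, var, p, r):
    """Union the required elements r with each subset of the possible ones."""
    required = frozenset(r)
    return [{**s, var: x | required} for x in subsets(p)]
```

With p = {a1, a2} and r = {a0}, only four values are produced, each containing a0, instead of the eight subsets of {a0, a1, a2} that an unbounded enumeration would consider. Each such value x also satisfies (x ∩ p) ∪ r = x, the identity used when proving the set case sound.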

A bounded-generation generator, when combined with a 𝒫𝒥-limited sound generator, is sound for
appropriate partial assignment duplications.

Theorem 15: If v_i ∈ Var_scalar and g_i^𝒫𝒥 is 𝒫𝒥-limited sound for PAd(φ, v_i in 𝒫), then the level i
bounded generator bg_i^𝒫𝒥ℛ using the exhaustive-enumeration 𝒫𝒥-limiting generator g_i^𝒫𝒥 is
sound for PAd(φ, v_i in 𝒫).

Proof:

Because φ ⊨ v_i in 𝒫, ∀s ∈ S_N. ⊨φ[s] = true ⇒ s(v_i) ∈ s̄(𝒫)

By definition, s(v_i) ∈ s̄(𝒫) ⇒ limit^𝒫𝒥(s(v_i),s) = s(v_i)

Therefore, ∀s ∈ S_N. ⊨φ[s] = true ⇒ limit_i^𝒫𝒥(Var_i ◁ s) = Var_i ◁ s.

Therefore, g_i^𝒫𝒥 is sound for PAd(φ, v_i in 𝒫).

Because bg_i^𝒫𝒥ℛ = g_i^𝒫𝒥 when v_i ∈ Var_scalar,

bg_i^𝒫𝒥ℛ is sound for PAd(φ, v_i in 𝒫). ■

Theorem 16: If v_i ∈ Var_set and g_i^𝒫𝒥 is 𝒫𝒥-limited sound for PAd(φ, (v_i <= (𝒫 U ℛ) and ℛ <= v_i)),
then the level i bounded generator bg_i^𝒫𝒥ℛ using the exhaustive-enumeration 𝒫𝒥-limiting
generator g_i^𝒫𝒥 is sound for PAd(φ, (v_i <= (𝒫 U ℛ) and ℛ <= v_i)).

Proof: By construction.

Let s ∈ S_N. ⊨φ[s] = true.

Because g_i^𝒫𝒥 is 𝒫𝒥-limited sound for the duplication,

    ∃s' ∈ S_N. s ≈_PAd s' ∧ limit_i^𝒫𝒥(Var_i ◁ s') ∈ g_i^𝒫𝒥(Var_{i-1} ◁ s).

By the definition of PAd and because ⊨φ[s] = true, s ≈_PAd s' ⇒ s = s'.

Therefore, limit_i^𝒫𝒥(Var_i ◁ s) ∈ g_i^𝒫𝒥(Var_{i-1} ◁ s).

By definition, limit_i^𝒫𝒥(Var_i ◁ s)(v_i) = s(v_i) ∩ s̄(𝒫).

Because φ ⊨ (v_i <= (𝒫 U ℛ) and ℛ <= v_i),

    φ ⊨ v_i <= (𝒫 U ℛ) ∧ φ ⊨ ℛ <= v_i

Therefore, ⊨φ[s] = true ⇒ s(v_i) ⊆ s̄((𝒫 U ℛ)) ∧ s̄(ℛ) ⊆ s(v_i).

Therefore s(v_i) ⊆ s̄(𝒫) ∪ s̄(ℛ).

Therefore, s(v_i) ⊆ (s(v_i) ∩ s̄(𝒫)) ∪ s̄(ℛ).

Because s̄(ℛ) ⊆ s(v_i), (s(v_i) ∩ s̄(𝒫)) ∪ s̄(ℛ) ⊆ s(v_i).

Therefore, (s(v_i) ∩ s̄(𝒫)) ∪ s̄(ℛ) = s(v_i).

Therefore, limit_i^𝒫𝒥(Var_i ◁ s)(v_i) ∪ s̄(ℛ) = s(v_i).

Therefore Var_i ◁ s ∈ bg_i^𝒫𝒥ℛ(Var_{i-1} ◁ s). ■

Theorem 17: If v_i ∈ Var_rel and g_i^𝒫𝒥 is 𝒫𝒥-limited sound for PAd(φ, (dom v_i <= 𝒫 and ran v_i <= 𝒥)),
then the level i bounded generator bg_i^𝒫𝒥ℛ using the exhaustive-enumeration 𝒫𝒥-limiting
generator g_i^𝒫𝒥 is sound for PAd(φ, (dom v_i <= 𝒫 and ran v_i <= 𝒥)).

Proof: By an argument similar to the one used in the proof of Theorem 16.

4. Isomorph Elimination

This section begins by providing an overview of isomorph elimination. The second subsection
defines permutation duplicates based on automorphisms on 𝒰, the universe of atomic elements.
Using the definition of the formula language, I show that applying an automorphism function to
all of the values in the range of an assignment does not change the interpretation of the formula
for the modified assignment. I define isomorph elimination in terms of these automorphisms.
From this, I show that isomorph-eliminating generators are sound.

In the final subsection, I show that an isomorph-eliminating generator can be used as the
underlying 𝒫𝒥-limiting generator for sound bounded generation.

4.1  Overview  of  Isomorph  Elimination 

Conceptually, isomorph elimination is a direct outgrowth of a simple observation: two isomorphic
assignments will give the same interpretation to any formula. Two assignments are isomorphic if
there is a consistent shuffling of the elements in one assignment, called a relabeling, that yields
the second assignment. As the elements of 𝒰 are unstructured, no two elements are
distinguishable, except through prior usage. Therefore, a relabeled assignment must give the same
interpretation to a formula as the original assignment.

Because of the incremental approach taken in selective enumeration, an isomorph-eliminating
generator considers only relabelings that do not affect the partial assignment already computed.
For an initial partial assignment, an isomorph-eliminating generator can safely exclude any value
for the new variable that is a relabeling of another value that is generated if that relabeling leaves
the initial partial assignment unchanged. In this way, the values already generated in earlier levels
limit the possible relabelings to consider in this new level.

Ideally,  the  generator  would  guarantee  that  no  assignments  generated  for  a  given  level  of  the 
search  tree  would  be  isomorphic  to  each  other.  However,  there  are  two  difficulties  in  achieving 
complete  isomorph  elimination.  As  the  generator  considers  only  a  single  partial  assignment  from 
the  previous  level  at  a  time,  it  cannot  guarantee  that  two  assignments  generated  from  two  different 
initial  partial  assignments  are  not  isomorphic.  Secondly,  perfect  recognition  of  isomorphs  is  itself 
non-polynomial,  so  acceptable  performance  requires  the  use  of  a  heuristic. 

Basic isomorph elimination is dependent solely on the formula language, rather than on the actual
formula being analyzed, as is bounded generation. Although further reductions can be gained by
considering the formula [JJD98], I ignore those considerations in this analysis.

4.2  Automorphisms 

An  automorphism  is  a  function  that  performs  a  consistent  relabeling  of  a  value  or  an  assignment. 

Definition 40: A one-to-one function μ: Value → Value is an automorphism for 𝒰 iff
∀y ∈ Value_set. ∀x ∈ Value_scalar. x ∈ y ⇔ μ(x) ∈ μ(y) ∧
∀z ∈ Value_rel. ∀x,x' ∈ Value_scalar. x' ∈ z(x) ⇔ μ(x') ∈ μ(z)(μ(x))

Definition 41: For any automorphism μ, μ(s) = { v ↦ x | v ∈ dom s ∧ x = μ(s(v)) }.

Furthermore, μ̄(s) is the semantic function induced by μ(s).

Because all of the elements of 𝒰 are unstructured, applying any automorphism to an assignment
does not change the meaning of that assignment for any terms or formulae.

Lemma 18: For any automorphism μ, term τ, and assignment s,
s̄(τ) ≠ unknown ⇒ μ(s̄(τ)) = μ̄(s)(τ).

Proof: By structural induction on the definition of s̄(τ).

If τ is v where v ∈ Variable

    By definition of s̄, s̄(τ) = s(v).

    By definition of μ(s), μ̄(s)(τ) = μ(s(v)).

If τ is (τ1 U τ2) where τ1,τ2 ∈ Term_set

    Because s̄(τ) ≠ unknown, s̄(τ1) ≠ unknown and s̄(τ2) ≠ unknown.

    By induction, μ(s̄(τ1)) = μ̄(s)(τ1) and μ(s̄(τ2)) = μ̄(s)(τ2).

    Therefore, by definition of s̄, μ(s̄(τ)) = μ̄(s)(τ)

Other productions follow similarly. ■

Theorem 19: For any automorphism μ, formula φ, and assignment s,
⊨φ[s] ≠ unknown ⇒ ⊨φ[s] = ⊨φ[μ(s)].

Proof: By Lemma 18 and structural induction on the definition of the formula language.

If φ is τ1 in τ2 where τ1 ∈ Term_scalar and τ2 ∈ Term_set

    Because ⊨φ[s] ≠ unknown, s̄(τ1) ≠ unknown and s̄(τ2) ≠ unknown.

    By definition of ⊨, if ⊨φ[s] = true, s̄(τ1) ∈ s̄(τ2).

    Therefore, by Definition 41, μ(s̄(τ1)) ∈ μ(s̄(τ2)).

    Therefore, by Lemma 18, μ̄(s)(τ1) ∈ μ̄(s)(τ2).

    Therefore, ⊨φ[μ(s)] = true

    By definition of ⊨, if ⊨φ[s] = false, s̄(τ1) ∉ s̄(τ2).

    Therefore, by Definition 41, μ(s̄(τ1)) ∉ μ(s̄(τ2)).

    Therefore, by Lemma 18, μ̄(s)(τ1) ∉ μ̄(s)(τ2).

    Therefore, ⊨φ[μ(s)] = false

If φ is (φ1 and φ2) where φ1,φ2 ∈ Wff

    If ⊨φ1[s] = true, ⊨φ1[μ(s)] = true by induction
    If ⊨φ2[s] = true, ⊨φ2[μ(s)] = true by induction
    Therefore, by definition of ⊨, if ⊨φ[s] = true, ⊨φ[μ(s)] = true
    If ⊨φ1[s] = false, ⊨φ1[μ(s)] = false by induction
    If ⊨φ2[s] = false, ⊨φ2[μ(s)] = false by induction
    Therefore, by definition of ⊨, if ⊨φ[s] = false, ⊨φ[μ(s)] = false

Other productions follow similarly. ■
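
The content of Theorem 19 can be checked by brute force on a small case. The sketch below is only an illustration: it assumes a three-atom universe, represents an automorphism as a dictionary that permutes the atoms, and hard-codes the single atomic formula a in used; all names are hypothetical.

```python
from itertools import permutations


def relabel(mu, value):
    """Apply the atom permutation mu to a scalar, set, or relation value."""
    if isinstance(value, frozenset):
        return frozenset(relabel(mu, e) for e in value)
    if isinstance(value, tuple):
        return tuple(relabel(mu, e) for e in value)
    return mu[value]


def relabel_assignment(mu, s):
    """The assignment obtained by relabeling every value bound in s."""
    return {v: relabel(mu, x) for v, x in s.items()}


def holds_a_in_used(s):
    """Interpretation of the atomic formula 'a in used' under s."""
    return s["a"] in s["used"]


def all_relabelings(atoms):
    """Every permutation of the atoms, as a dictionary."""
    atoms = sorted(atoms)
    return [dict(zip(atoms, perm)) for perm in permutations(atoms)]


ATOMS = {"a0", "a1", "a2"}
s_true = {"a": "a0", "used": frozenset({"a0", "a1"})}
s_false = {"a": "a2", "used": frozenset({"a0", "a1"})}
```

Relabeling s_true with any of the six permutations still satisfies the formula, and relabeling s_false never does.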

An  isomorph-eliminating  generator  considers  only  automorphisms  that  leave  the  initial  partial 
assignment  unchanged.  These  automorphisms  are  called  identities. 


Definition 42: An automorphism μ is an identity for an assignment s iff s = μ(s)

For any empty assignment or an assignment containing only empty sets and empty relations, any
automorphism is an identity. For the assignment considered earlier

{ usage' ↦ { (a0,v0) }, used' ↦ ∅ }

any automorphism that maps a0 to a0 and v0 to v0 is an identity.

Identities for the more complicated assignment

{ usage' ↦ { (a0,v0), (a1,v0) }, used' ↦ {a0, a1} }

include automorphisms that map a0 to a1, a1 to a0, and v0 to v0 as well as the expected
automorphisms that map a0 to a0, a1 to a1, and v0 to v0.
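
The identities of the second example assignment can be enumerated directly. The following sketch assumes a universe of just the three atoms a0, a1, and v0, and models automorphisms as permutations of those atoms; the helper names are hypothetical.

```python
from itertools import permutations


def relabel(mu, value):
    """Apply the atom permutation mu to a scalar, set, or relation value."""
    if isinstance(value, frozenset):
        return frozenset(relabel(mu, e) for e in value)
    if isinstance(value, tuple):
        return tuple(relabel(mu, e) for e in value)
    return mu[value]


def identities(atoms, s):
    """All permutations of atoms that leave the assignment s unchanged."""
    atoms = sorted(atoms)
    found = []
    for perm in permutations(atoms):
        mu = dict(zip(atoms, perm))
        if {v: relabel(mu, x) for v, x in s.items()} == s:
            found.append(mu)
    return found


s = {
    "usage'": frozenset({("a0", "v0"), ("a1", "v0")}),
    "used'": frozenset({"a0", "a1"}),
}
ids = identities({"a0", "a1", "v0"}, s)
```

Under these assumptions ids contains exactly two permutations, the one that fixes every atom and the one that swaps a0 and a1, while the empty assignment admits all six permutations as identities.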

If  an  automorphism  is  an  identity  over  a  partial  assignment,  it  is  an  identity  for  any  value  obtained 
by  evaluating  a  term  using  that  assignment. 

Lemma 20: If μ is an automorphism that is an identity for a partial assignment s ∈ S_i and τ ∈ Term
such that FV(τ) ⊆ Var_i, then μ(s̄(τ)) = s̄(τ).

Proof: By structural induction on term language.

By Theorem 1, s̄(τ) ≠ unknown.

If τ is v, where v ∈ Variable

    s̄(τ) = s(v) by definition

    Because μ is an identity for s, μ(s̄(τ)) = s̄(τ)

If τ is (τ1 U τ2), where τ1,τ2 ∈ Term_set

    By induction, μ(s̄(τ1)) = s̄(τ1) and μ(s̄(τ2)) = s̄(τ2)

    Therefore, μ(s̄(τ)) = s̄(τ)

Other productions follow similarly. ■

4.3  Definition  of  Isomorph  Elimination 

To  define  isomorph  elimination  precisely,  it  is  useful  to  start  with  a  definition  of  the  duplication 
being  reduced  by  the  generator.  The  duplication  places  any  two  assignments  in  the  same 
equivalence  class  if  they  are  isomorphic  to  each  other. 

Definition 43: An equivalence relation ≈_π is a permutation equivalence relation iff
∀s,s' ∈ S_N. s ≈_π s' ⇔ (∃μ. μ is an automorphism function and s' = μ(s))

Lemma 21: The partitioning defined by ≈_π is a duplication.

Proof: By Theorem 19.

Definition 44: πd(φ) is the duplication induced by ≈_π.

The  goal  of  an  isomorph-eliminating  generator  is  to  generate  only  assignments  that  are  not 
isomorphs  of  other  assignments  generated. 

Definition 45: A level i generator g_i is an isomorph-eliminating generator iff
∀s ∈ S_{i-1}. ∀x ∈ Value. ∃s' ∈ g_i(s).
    (∃μ. μ is an automorphism ∧ μ is an identity for s ∧ s'(v_i) = μ(x))

An  isomorph-eliminating  generator  is  any  generator  that  is  guaranteed  to  generate  at  least  any 
value  for  this  level  that  is  not  isomorphic  to  another  value  also  generated.  As  was  noted  earlier,  it 
is  not  realistic  to  require  an  isomorph-eliminating  generator  to  remove  all  isomorphs.  As  such,  the 
exhaustive-enumeration  generator  given  in  Definition  21  is  a  valid  isomorph-eliminating 
generator,  albeit  an  extraordinarily  inefficient  one.  In  practice,  a  middle  ground  is  available;  an 
efficient  generator  that  eliminates  almost  all  isomorphs  can  be  implemented  with  reasonable 
effort. 
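
A brute-force generator satisfying Definition 45 for set values can be sketched as follows. This is not the heuristic generator alluded to above: it enumerates every permutation of a small, assumed universe of atoms, which is only practical for tiny examples, and every name in it is hypothetical.

```python
from itertools import combinations, permutations


def relabel(mu, value):
    """Apply the atom permutation mu to a scalar, set, or relation value."""
    if isinstance(value, frozenset):
        return frozenset(relabel(mu, e) for e in value)
    if isinstance(value, tuple):
        return tuple(relabel(mu, e) for e in value)
    return mu[value]


def identities(atoms, s):
    """All permutations of atoms that leave the partial assignment s fixed."""
    atoms = sorted(atoms)
    candidates = (dict(zip(atoms, perm)) for perm in permutations(atoms))
    return [mu for mu in candidates
            if {v: relabel(mu, x) for v, x in s.items()} == s]


def isomorph_eliminated_sets(atoms, s):
    """Keep one set value per class of values related by an identity of s."""
    ids = identities(atoms, s)
    ordered = sorted(atoms)
    seen, kept = set(), []
    for size in range(len(ordered) + 1):
        for combo in combinations(ordered, size):
            x = frozenset(combo)
            if x in seen:
                continue  # a relabeling of x was already generated
            kept.append(x)
            seen.update(relabel(mu, x) for mu in ids)
    return kept
```

With three atoms and an empty partial assignment, only four of the eight subsets are kept (one per size). Once a scalar variable is bound to a0, fewer relabelings are identities and six subsets must be kept.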

Because an isomorph-eliminating generator only drops an assignment if there is an automorphism
that relabels that assignment into one that is generated, there is at least one element generated for
each equivalence class in πd(φ).

Theorem 22: An isomorph-eliminating generator g_i is sound for πd(φ).

Proof: Assume s ∈ S_N. ⊨φ[s] = true.

Let x = s(v_i).

By definition of an isomorph-eliminating generator,

    ∃s' ∈ g_i(Var_{i-1} ◁ s).

        (∃μ. μ is an automorphism ∧

         μ is an identity for s ∧ s'(v_i) = μ(x))

Therefore, μ(s) = s'.

Therefore, s ≈_π s'. ■

4.4  Interaction  of  Isomorph  Elimination  and  Bounded  Generation 

To minimize the number of assignments generated, selective enumeration utilizes all the
duplications available. As shown in Theorem 6, different partial assignment duplications combine
to reduce the number of assignments for a single formula in a straightforward manner. The
complexity comes when combining a partial assignment duplication with a permutation
duplication: PAd(φ,φ') ∘ πd(φ).

If the partial assignment duplication enables a derived variable, only one assignment will be
generated and combination with isomorph elimination is unnecessary. Short circuiting, as
described in the next section, is really more of a post-filter than a generator and combines with
any other generators in a straightforward manner. The issue is therefore limited to combining
isomorph-eliminating generators with bounded-generation generators. The approach is to utilize
an isomorph-eliminating generator as the underlying generator for a bounded-generation
generator.

This requires a 𝒫𝒥-limiting version of an isomorph-eliminating generator to be defined.

Definition 46: For 𝒫,𝒥 ∈ Term_set such that FV(𝒫) ⊆ Var_{i-1} and FV(𝒥) ⊆ Var_{i-1}, a level i generator
g_i^𝒫𝒥 is a 𝒫𝒥-isomorph-eliminating generator for formula φ iff

(v_i ∈ Var_scalar ⇒ ∀s ∈ S_{i-1}. ∀x ∈ Value_scalar.
    (x ∉ s̄(𝒫)) ∨
    ∃s' ∈ g_i^𝒫𝒥(s).
        (∃μ. μ is an automorphism ∧
         μ is an identity for s ∧
         s'(v_i) = μ(limit^𝒫𝒥(x,s)))) ∧
(v_i ∈ Var_set ⇒ ∀s ∈ S_{i-1}. ∀x ∈ Value_set.
    ∃s' ∈ g_i^𝒫𝒥(s).
        (∃μ. μ is an automorphism ∧
         μ is an identity for s ∧
         s'(v_i) = μ(limit^𝒫𝒥(x,s)))) ∧
(v_i ∈ Var_rel ⇒ ∀s ∈ S_{i-1}. ∀x ∈ Value_rel.
    ∃s' ∈ g_i^𝒫𝒥(s).
        (∃μ. μ is an automorphism ∧
         μ is an identity for s ∧
         s'(v_i) = μ(limit^𝒫𝒥(x,s)))).

A 𝒫𝒥-isomorph-eliminating generator generates a 𝒫𝒥-limited value for each possible value or it
generates a relabeling of the 𝒫𝒥-limited value. But a relabeling of the 𝒫𝒥-limited value is itself a
𝒫𝒥-limited value. Therefore, a 𝒫𝒥-isomorph-eliminating generator is a 𝒫𝒥-limiting generator.

Lemma 23: A 𝒫𝒥-isomorph-eliminating generator is a 𝒫𝒥-limiting generator.

Proof: Need to prove that ∀s ∈ S_{i-1}. ∀s' ∈ g_i^𝒫𝒥(s). s'(v_i) is 𝒫𝒥-limited.

By definition, ∃x. s'(v_i) = μ(limit^𝒫𝒥(x,s)).

Because μ is an identity for s and FV(𝒫) ⊆ dom s and FV(𝒥) ⊆ dom s,
μ(s̄(𝒫)) = s̄(𝒫) and μ(s̄(𝒥)) = s̄(𝒥).

If v_i ∈ Var_scalar,

    limit^𝒫𝒥(x,s) = x

    By definition, x ∈ s̄(𝒫) ⇒ μ(x) ∈ μ(s̄(𝒫))

    Because μ(s̄(𝒫)) = s̄(𝒫), μ(x) ∈ s̄(𝒫)

    Therefore, s'(v_i) is 𝒫𝒥-limited

If v_i ∈ Var_set

    limit^𝒫𝒥(x,s) = x ∩ s̄(𝒫)

    so μ(limit^𝒫𝒥(x,s)) = μ(x) ∩ μ(s̄(𝒫))

    so μ(limit^𝒫𝒥(x,s)) = μ(x) ∩ s̄(𝒫)

    Therefore, s'(v_i) is 𝒫𝒥-limited

If v_i ∈ Var_rel

    limit^𝒫𝒥(x,s) = x ∩ (s̄(𝒫) × s̄(𝒥))

    so μ(limit^𝒫𝒥(x,s)) = μ(x) ∩ (μ(s̄(𝒫)) × μ(s̄(𝒥)))

    so μ(limit^𝒫𝒥(x,s)) = μ(x) ∩ (s̄(𝒫) × s̄(𝒥))

    Therefore, s'(v_i) is 𝒫𝒥-limited

Therefore, g_i^𝒫𝒥 is a 𝒫𝒥-limiting generator. ■

A 𝒫𝒥-isomorph-eliminating generator is also 𝒫𝒥-limited sound.

Lemma 24: A 𝒫𝒥-isomorph-eliminating generator g_i^𝒫𝒥 is 𝒫𝒥-limited sound for πd(φ).

Proof: Assume s ∈ S_N. ⊨φ[s] = true.

Need to find an element s' ∈ S_N. s ≈_π s' ∧ Var_i ◁ s' ∈ g_i^𝒫𝒥(Var_{i-1} ◁ s)

By definition, ∃s'' ∈ g_i^𝒫𝒥(Var_{i-1} ◁ s).

    ∃μ. μ(s(v_i)) = s''(v_i) ∧ μ is an identity for Var_{i-1} ◁ s.

Therefore, μ(Var_i ◁ s) = s''.

Therefore ∃s' ∈ S_N. s ≈_π s' ∧ Var_i ◁ s' = s''.

Therefore, Var_i ◁ s' ∈ g_i^𝒫𝒥(Var_{i-1} ◁ s). ■

Together, a 𝒫𝒥-isomorph-eliminating generator and a bounded-generation generator generate
fewer assignments than either a simple isomorph-eliminating generator or a bounded-generation
generator paired with an exhaustive-enumeration generator. The combination, however, is still
sound, now for the combination of an appropriate partial assignment duplication with the
permutation duplication.

Theorem 25: If v_i ∈ Var_scalar and g_i^𝒫𝒥 is a 𝒫𝒥-limiting isomorph-eliminating generator, then the
level i bounded generator bg_i^𝒫𝒥ℛ using g_i^𝒫𝒥 is sound for PAd(φ, v_i in 𝒫) ∘ πd(φ).

Proof: Assume s ∈ S_N such that ⊨φ[s] = true.

Need to prove ∃s' ∈ S_N. s ≈ s' ∧ Var_i ◁ s' ∈ bg_i^𝒫𝒥ℛ(s).

Because φ ⊨ v_i in 𝒫, ∀s ∈ S_N. ⊨φ[s] = true ⇒ s(v_i) ∈ s̄(𝒫)

By definition, s(v_i) ∈ s̄(𝒫) ⇒ limit^𝒫𝒥(s(v_i),s) = s(v_i)

Therefore, ∀s ∈ S_N. ⊨φ[s] = true ⇒ limit_i^𝒫𝒥(Var_i ◁ s) = Var_i ◁ s.

By Lemma 24, g_i^𝒫𝒥 is 𝒫𝒥-limited sound for πd(φ).

Therefore, g_i^𝒫𝒥 is sound for πd(φ).

Therefore, by Theorem 9, g_i^𝒫𝒥 is sound for PAd(φ, v_i in 𝒫) ∘ πd(φ).

Since bg_i^𝒫𝒥ℛ simply passes through the assignments generated by g_i^𝒫𝒥,

bg_i^𝒫𝒥ℛ is sound for PAd(φ, v_i in 𝒫) ∘ πd(φ). ■

Theorem 26: If v_i ∈ Var_set and g_i^𝒫𝒥 is a 𝒫𝒥-limiting isomorph-eliminating generator, then the level
i bounded generator bg_i^𝒫𝒥ℛ using g_i^𝒫𝒥 is sound for PAd(φ, (v_i <= (𝒫 U ℛ) and ℛ <= v_i)) ∘
πd(φ).

Proof: Assume s ∈ S_N such that ⊨φ[s] = true.

By definition of g_i^𝒫𝒥,

    ∃s' ∈ g_i^𝒫𝒥(s).

        (∃μ. μ is an automorphism ∧
         μ is an identity for s ∧
         s'(v_i) = μ(limit^𝒫𝒥(s(v_i),s))).

Therefore, μ(limit_i^𝒫𝒥(Var_i ◁ s)) = s'.

By definition, limit_i^𝒫𝒥(Var_i ◁ s)(v_i) = s(v_i) ∩ s̄(𝒫).

Because φ ⊨ (v_i <= (𝒫 U ℛ) and ℛ <= v_i),

    φ ⊨ v_i <= (𝒫 U ℛ) ∧ φ ⊨ ℛ <= v_i

Therefore, ⊨φ[s] = true ⇒ s(v_i) ⊆ s̄((𝒫 U ℛ)) ∧ s̄(ℛ) ⊆ s(v_i).

Therefore s(v_i) ⊆ s̄(𝒫) ∪ s̄(ℛ).

Therefore, s(v_i) ⊆ (s(v_i) ∩ s̄(𝒫)) ∪ s̄(ℛ).

Because s̄(ℛ) ⊆ s(v_i), (s(v_i) ∩ s̄(𝒫)) ∪ s̄(ℛ) ⊆ s(v_i).

Therefore, (s(v_i) ∩ s̄(𝒫)) ∪ s̄(ℛ) = s(v_i).

Therefore, limit_i^𝒫𝒥(Var_i ◁ s)(v_i) ∪ s̄(ℛ) = s(v_i).

Therefore, limit_i^𝒫𝒥(Var_i ◁ s) = Var_i ◁ s.

Therefore, μ(Var_i ◁ s) = s'.

Therefore, ∃s'' ∈ S_N. μ(s) = s''.

Therefore, s ≈ s'' ∧ Var_i ◁ s'' ∈ g_i^𝒫𝒥(s). ■

Theorem 27: If v_i ∈ Var_rel and g_i^𝒫𝒥 is a 𝒫𝒥-limiting isomorph-eliminating generator, then the level
i bounded generator bg_i^𝒫𝒥ℛ using g_i^𝒫𝒥 is sound for PAd(φ, (dom v_i <= 𝒫 and ran v_i <= 𝒥)) ∘
πd(φ).

Proof: By a similar argument to the one used to prove Theorem 26.

5. Derived Variables and Short Circuiting

This section formalizes the selective enumeration techniques called derived variables and short
circuiting, which were introduced by example in the first section. Each technique will be proven
sound.

These  techniques  are  far  simpler  than  isomorph  elimination  or  bounded  generation.  These 
techniques  also  lack  the  complications  of  combining  techniques  found  with  bounded  generation 
and  isomorph  elimination. 

5.1  Overview  of  Derived  Variables 

A derived variable is one that has a constructive derivation. Typically, one of the atomic formulae
conjoined in the formula being analyzed gives an explicit value for the variable. This occurs in
several places in the example claim uniqueAddrAlloc. In the schema Heap, the atomic formula
used = dom usage restricts used to a single value for each value of usage. Similarly, usage
could be derived from the variables used and usage', using the atomic formula (used <: usage')
= usage. Clearly, the use of one of these derivations invalidates the other; either usage must be
generated before used or used must be generated before usage. The set of available derivations is
both an input to and a result of the choice of variable ordering.

Derived variables can be considered as the ultimate application of bounded generation. For
derived variables, the value of the variable is determined by the value of a term that I call 𝒯. A
scalar variable is a derived variable if the bounded generation term 𝒫 is limited to a single element
for all possible partial assignments. The term 𝒯 defines that single element. Similarly, a set
variable is a derived variable if the best bounded generation term 𝒫 is empty for all assignments;
in this case the derivation term 𝒯 is the same as the bounded generation term ℛ. The limitations
on bounded generations of relations prohibit a direct equivalence with derived variables, however.

5.2 Definition of Derived-Variable Generation

A variable may be derived if the value is constrained to a single term and that term involves only
variables that precede the derived variable in Ord.

Definition 47: For a term 𝒯, a level i generator g_i^𝒯 is a derived-variable generator for formula φ, iff
φ ⊨ v_i = 𝒯 ∧ FV(𝒯) ⊆ Var_{i-1} ∧ ∀s ∈ S_{i-1}. g_i^𝒯(s) = { s ∪ {v_i ↦ s̄(𝒯)} }

A derived-variable generator is always sound for any duplication.

Theorem 28: A level i derived-variable generator g_i^𝒯 for formula φ is sound for any duplication
d(φ).

Proof: Let s ∈ S_N. ⊨φ[s] = true.

Because φ ⊨ v_i = 𝒯, ⊨φ[s] = true ⇒ s(v_i) = s̄(𝒯).

Therefore Var_i ◁ s ∈ g_i^𝒯(Var_{i-1} ◁ s) ■
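
A derived-variable generator is simple enough to sketch in a few lines of Python. The example below assumes the derivation used = dom usage from the Heap schema, with usage generated before used; the helper names dom and derived_generator are hypothetical.

```python
def dom(rel):
    """Domain of a relation represented as a set of (y, z) pairs."""
    return frozenset(y for (y, _) in rel)


def derived_generator(var, derive):
    """Build a generator that yields exactly one extension of s."""
    def generate(s):
        # The only candidate is the value dictated by the derivation term.
        return [{**s, var: derive(s)}]
    return generate


# Assumed derivation: used = dom usage.
gen_used = derived_generator("used", lambda s: dom(s["usage"]))
```

Given a partial assignment that binds usage to {(a0,v0), (a1,v0)}, gen_used yields the single extension that binds used to {a0, a1}.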

5.3 Overview of Short Circuiting

Short  circuiting  is  more  of  a  post-filter  than  a  generator  technique.  Short  circuiting  removes  all 
remaining  partial  assignment  duplicates.  Given  a  set  of  formulae  that  must  be  satisfied,  short 
circuiting  removes  any  partial  assignments  that  fail  to  satisfy  one  or  more  of  the  formulae. 

In  the  example  considered  earlier,  bounded  generation  had  limited  the  values  generated  for  usage 
such  that  the  domain  of  usage  was  a  subset  of  the  domain  of  usage'  and  the  range  of  usage  was 
a  subset  of  the  range  of  usage'.  The  formula,  however,  includes  a  stronger  constraint:  (used  <: 
usage')  =  usage.  Short  circuiting  removes  any  partial  assignments  involving  used,  usage,  and 
usage'  that  fail  to  satisfy  this  constraint. 

Like  bounded  generation,  a  short-circuiting  generator  depends  on  another  generator  to  produce 
assignments,  modifying  the  set  of  assignments  generated.  Unlike  bounded  generation,  a  short- 
circuiting  generator  may  suppress  some  assignments  generated  by  the  underlying  generator, 
preventing  their  consideration  in  the  next  level  of  the  search. 

5.4  Definition  of  Short  Circuiting 

A short-circuiting generator requires a set of formulae, referred to as ℱ, and an underlying
generator. The short-circuiting generator yields every assignment yielded by the underlying
generator that satisfies all of the formulae within ℱ.

Definition 48: For a set of formulae ℱ, a level i generator g_i^ℱ is a short-circuiting generator for
formula φ using a level i generator g'_i iff
∀f ∈ ℱ. (φ ⊨ f ∧ ∀s ∈ S_{i-1}. g_i^ℱ(s) = { s' | s' ∈ g'_i(s) ∧ ⊨f[s'] = true }).

A  short-circuiting  generator  is  sound  if  the  underlying  generator  is  sound. 

Theorem 29: A level i short-circuiting generator g_i^ℱ for formula φ using a level i generator g'_i is
sound for PAd(φ,φ') ∘ d(φ) if g'_i is sound for d(φ), where φ' is f_1 and f_2 and f_3 ... and f_n
with f_1..f_n ∈ ℱ.

Proof: Proof by construction.

Assume s ∈ S_N. ⊨φ[s] = true.

Because g'_i is sound for d(φ), g'_i is sound for PAd(φ,φ') ∘ d(φ) by Theorem 9.
Therefore, ∃s' ∈ S_N. s ≈ s' ∧ Var_i ◁ s' ∈ g'_i(Var_{i-1} ◁ s).

Because ⊨φ[s] = true ∧ s ≈ s', ⊨φ[s'] = true.

Because φ ⊨ f for every f ∈ ℱ, ∀f ∈ ℱ. ⊨f[s'] = true.

Therefore, s' ∈ g_i^ℱ(s). ■

6.  Refining  the  Model

The model of selective enumeration developed thus far is somewhat limited. The first subsection
here re-introduces given types. Given types are a partitioning of V that limits the acceptable
values for variables. The NP language allows relational variables to be limited to subsets of
$\mathit{Value}_{rel}$, such as functions, injections, or surjections. Limiting a variable to denote only
relations that are functions is the most common of these limitations. The second subsection
focuses on this functional limitation.

6.1  Reconsidering  Given  Types 

Given types are disjoint subsets of $\mathcal{U}$, the universe of elements. In NP, a variable or expression is
not  simply  restricted  to  denoting  a  scalar,  set,  or  relational  value;  instead,  it  is  restricted  to 
denoting  an  element  of  a  given  type,  or  a  subset  of  a  given  type,  or  a  relation  drawn  from  the  cross 
product  of  given  types. 

This  information  is  most  easily  captured  as  additional  constraints  on  the  formula  under 
consideration.  For  example,  the  uniqueAddrAlloc  claim  is  translated  and  negated  into  Formula  3; 
in  addition  to  the  requirements  given  by  Formula  2,  the  first  four  lines  represent  the  constraint 
implied  by  the  use  of  the  given  types  in  the  variable  declarations. 

Formula 3: $\phi$ = (((((((dom usage <= Addr and ran usage <= Value) and
                (dom usage' <= Addr and ran usage' <= Value)) and
                (used' <= Addr and used <= Addr)) and
                a in Addr) and
                (dom usage = used and dom usage' = used')) and
                (func usage and func usage')) and
                (((used <: usage') = usage and used' = (used U {a}))
                and a in used))

This new formula uses two new variables, Addr and Value, representing the two given types.
These variables are generated before any standard variables using a special generator, called a
scope generator. The scope generator uses the scope, a user-provided mapping from each given
type to the number of elements desired in that type.

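Definition 49: $\mathit{Variable}_{all} = \mathit{Variable}_{scalar} \cup \mathit{Variable}_{set} \cup \mathit{Variable}_{rel} \cup \mathit{Variable}_{type}$

Definition 50: A function $\sigma : \mathit{Variable}_{type} \rightarrow 1..N$ is a Scope iff $\sum \mathrm{ran}\ \sigma \leq N$

Definition 51: A scope generator $sg : \mathit{Variable}_{type} \times \mathit{Scope} \rightarrow \mathbb{P}\,\mathcal{U}$ is defined as
$sg(t,\sigma) = T.\ |T| = \sigma(t) \land x \in T \Rightarrow x \notin \bigcup_{t_i \neq t} sg(t_i,\sigma)$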

Bounded  generation  and  derived  variables  can  now  guarantee  that  only  values  satisfying  the  given 
type  constraints  will  be  generated. 
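
As a rough illustration of Definitions 50 and 51, the following Python sketch builds disjoint given types from a scope. It assumes, purely for the example, that the universe consists of the integers 1..N and that each given type receives a contiguous block of them; the function name `scope_generator` and its parameters are hypothetical.

```python
def scope_generator(scope, universe_size):
    """Give each given type a disjoint block of elements drawn from 1..universe_size."""
    # Definition 50: the requested sizes must fit within the universe.
    if sum(scope.values()) > universe_size:
        raise ValueError("scope requests more elements than the universe holds")
    types, next_element = {}, 1
    for type_name, count in scope.items():
        # Definition 51: |T| = scope[t], and T shares no element with any other type.
        types[type_name] = frozenset(range(next_element, next_element + count))
        next_element += count
    return types

types = scope_generator({"Addr": 2, "Value": 3}, universe_size=5)
```

Any other choice of disjoint subsets of the right sizes would satisfy the definition equally well; contiguous blocks are simply the easiest to construct.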

This does have a notable practical effect on isomorph elimination, however. Because the given
types will always be generated first, only automorphisms that do not map any elements between
given types will be identities over any partial assignment. Therefore the additional reductions
gained by bounded generation include many reductions lost in isomorph elimination. However,
this restriction can also be used to advantage in the implementation of isomorph elimination; a
much smaller space of automorphisms need be considered during generation.

6.2  Limiting  Relations  to  Functions 

The original specification restricted the values of usage and usage' to be functions, whereas the
translated formula allowed these variables to be mapped to any relation. A further improvement
on the formula translation is required. Formula 4 adds another constraint involving these variables
to the formula given in Formula 3.

Formula 4: $\phi$ = ((((((((dom usage <= Addr and ran usage <= Value) and
                (dom usage' <= Addr and ran usage' <= Value)) and
                (used' <= Addr and used <= Addr)) and
                a in Addr) and
                (func usage and func usage')) and
                (dom usage = used and dom usage' = used')) and
                (func usage and func usage')) and
                (((used <: usage') = usage and used' = (used U {a}))
                and a in used))

Short circuiting could easily handle these additional constraints directly by including func usage
and func usage' as two of the formulae in $\mathcal{F}$. However, this misses some important opportunities
to further reduce the number of assignments generated.

The  exhaustive-enumeration  generator  and  the  isomorph-eliminating  generators  can  be  easily 
enhanced  to  generate  only  functions.  Bounded  generation  can  also  be  extended  to  take  advantage 
of  this  case  without  much  effort. 
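
To give a feel for the saving, the sketch below compares enumerating every relation and then filtering with a `func` test against enumerating functions directly. It is an illustration only: it assumes that `func` denotes a partial function (consistent with the constraint dom usage = used in the formulae above), and the helper names are hypothetical.

```python
from itertools import chain, combinations, product

def all_relations(domain, codomain):
    """Every relation over domain x codomain, as a frozenset of pairs."""
    pairs = list(product(sorted(domain), sorted(codomain)))
    subsets = chain.from_iterable(combinations(pairs, r) for r in range(len(pairs) + 1))
    return [frozenset(s) for s in subsets]

def all_functions(domain, codomain):
    """Every partial function: each domain element maps to at most one value."""
    choices = [None] + sorted(codomain)  # None means "not in the function's domain"
    dom = sorted(domain)
    return [
        frozenset((x, y) for x, y in zip(dom, images) if y is not None)
        for images in product(choices, repeat=len(dom))
    ]

def is_function(rel):
    """True iff no element is related to two different values."""
    firsts = [x for x, _ in rel]
    return len(firsts) == len(set(firsts))

addr, value = {1, 2}, {3, 4}
relations = all_relations(addr, value)               # 2 ** (2 * 2) candidates
functions = all_functions(addr, value)               # (2 + 1) ** 2 candidates
filtered = [r for r in relations if is_function(r)]  # what short circuiting would keep
```

Both routes arrive at the same set of values, but the direct generator never constructs the relations that short circuiting would later discard.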

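Definition 52: For any $\mathcal{P}, \mathcal{Q}, \mathcal{R} \in \mathit{Term}_{set}$ and $\mathcal{E} \in \mathit{Term}_{rel}$, a level $i$ generator $bg_i^{\mathcal{PQRE}}$ is a function-
aware bounded generator for $\phi$ using a $\mathcal{PQ}$-limiting generator iff

$$
\begin{aligned}
& FV(\mathcal{P}) \subseteq Var_{i-1} \land FV(\mathcal{Q}) \subseteq Var_{i-1} \land FV(\mathcal{R}) \subseteq Var_{i-1} \land FV(\mathcal{E}) \subseteq Var_{i-1} \;\land \\
& (v_i \in Var_{scalar} \Rightarrow \\
& \qquad \phi \models v_i\ \texttt{in}\ \mathcal{P} \land \mathcal{Q} = \{\} \land \mathcal{R} = \{\} \land \mathcal{E} = \{\} \;\land \\
& \qquad bg_i^{\mathcal{PQRE}}(s) = g_i^{\mathcal{PQ}}(s)) \;\land \\
& (v_i \in Var_{set} \Rightarrow \\
& \qquad \phi \models v_i\ \texttt{<=}\ (\mathcal{P} \cup \mathcal{R}) \land \phi \models \mathcal{R}\ \texttt{<=}\ v_i \land \mathcal{Q} = \{\} \land \mathcal{E} = \{\} \;\land \\
& \qquad bg_i^{\mathcal{PQRE}}(s) = \{\, s' \mid \exists s'' \in g_i^{\mathcal{PQ}}(s).\ s' = s'' \cup s(\mathcal{R}) \,\}) \;\land \\
& (v_i \in Var_{rel} \land \phi \models \texttt{func}\ v_i \Rightarrow \\
& \qquad \phi \models \texttt{dom}\ v_i\ \texttt{<=}\ (\mathcal{P} \cup \texttt{dom}\ \mathcal{E}) \land \phi \models \texttt{ran}\ v_i\ \texttt{<=}\ \mathcal{Q} \land \phi \models \mathcal{E}\ \texttt{<=}\ v_i \land \mathcal{R} = \{\} \;\land \\
& \qquad bg_i^{\mathcal{PQRE}}(s) = \{\, s' \mid \exists s'' \in g_i^{\mathcal{PQ}}(s).\ s' = s'' \cup s(\mathcal{E}) \,\}) \;\land \\
& (v_i \in Var_{rel} \land \lnot\, \phi \models \texttt{func}\ v_i \Rightarrow \\
& \qquad \phi \models \texttt{dom}\ v_i\ \texttt{<=}\ \mathcal{P} \land \phi \models \texttt{ran}\ v_i\ \texttt{<=}\ \mathcal{Q} \land \mathcal{R} = \{\} \land \mathcal{E} = \{\} \;\land \\
& \qquad bg_i^{\mathcal{PQRE}}(s) = g_i^{\mathcal{PQ}}(s))
\end{aligned}
$$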

This generator works for functions much the way the original generator works for sets. $\mathcal{E}$ is a term
describing a set of edges that must be included in any value generated. Because the value must be
a function, the domain of $\mathcal{E}$ can be excluded from $\mathcal{P}$.
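
The effect of excluding the domain of the mandatory edges can be sketched as follows. This is a hedged illustration rather than the paper's algorithm: `p` and `q` stand for the already-evaluated domain and range bounds, `e` for the already-evaluated set of mandatory edges, and the function name `function_aware_values` is hypothetical.

```python
from itertools import product

def function_aware_values(p, q, e):
    """Functions that contain every edge in e, with other edges drawn from p x q."""
    fixed = {x for x, _ in e}
    free = sorted(set(p) - fixed)        # dom e is excluded from p
    choices = [None] + sorted(q)         # None means "element left unmapped"
    return [
        frozenset(e) | frozenset((x, y) for x, y in zip(free, images) if y is not None)
        for images in product(choices, repeat=len(free))
    ]

# Element 1 is already mapped by the mandatory edge, so only 2 and 3 are free.
values = function_aware_values(p={1, 2, 3}, q={7, 8}, e={(1, 7)})
```

Because element 1 already has an image, no candidate that maps it elsewhere is ever built, so every generated value is a function by construction.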

7.  Conclusion

This  paper  has  introduced  a  formal  framework  for  describing  selective  enumeration  and  the 
techniques  that  implement  it.  In  related  papers,  I  define  specific  algorithms  for  implementing  each 
technique  and  show  that  they  lead  to  sound  generators.  I  also  define  an  algorithm  for  discovering  a 
set  of  constraining  formulae,  as  needed  by  partial  assignment  duplications,  and  an  algorithm  for 
selecting  an  ordering  that  takes  substantial  advantage  of  the  reduction  opportunities. 

7.1  Future  Work 

One  potential  future  direction  is  the  search  for  additional  forms  of  duplication.  Although  the  two 
duplications  I  describe  in  this  paper  are  effective  at  reducing  the  number  of  assignments 
generated,  further  duplications  would  enable  additional  specifications  to  be  analyzed. 

Another  possible  direction  is  analyzing  different  input  languages.  A  promising  candidate  is 
OCL [IBM97], the constraint language recently defined for the object specification notation UML.
Most  of  OCL  can  be  directly  translated  into  the  formula  language  I  use  here.  The  object 
orientation  introduces  some  new  concepts,  particularly  inheritance,  that  require  additional 
consideration  and  may  enable  additional  constraints. 

A  final  possible  direction  involves  applying  the  general  approach  of  selective  enumeration  to 
incremental  search  problems  in  other,  non-relational  domains.  If  any  easily  computable  features 
can  be  defined  that  distinguish  interesting  and  non-interesting  instances,  the  framework  I 
described  in  this  paper  can  be  applied. 

As  an  example,  consider  the  problem  of  TF-sensitive  test  generation  [CM94].  A  test  T  for  a 
boolean  expression  E  is  sensitive  to  a  variable  x  in  E  if  changing  only  the  value  of  x  changes  the 
result  of  T  for  E.  The  goal  in  test  generation  is  to  obtain  a  near-minimal  test  set  that  contains  at 
least  one  test  that  is  sensitive  for  each  variable.  Like  the  relational  satisfaction  problem,  the 
general  solution  is  exponential. 

Conceptually, a selective-enumeration-like search could be used to find such a set. It seems likely
that  techniques  could  be  developed  that  efficiently  rule  out  tests  as  duplicates  if  they  are  not 
sensitive  to  any  variables  not  already  tested  by  a  test  in  the  set. 
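
The notion of sensitivity used in this example can be made concrete with a small Python sketch. It assumes, for illustration only, that a test is a dictionary from variable names to booleans and that the expression is an ordinary Python function over such a dictionary; `is_sensitive` and `expr` are hypothetical names.

```python
def is_sensitive(expr, test, var):
    """True iff flipping only `var` in `test` changes the value of `expr`."""
    flipped = {**test, var: not test[var]}
    return expr(test) != expr(flipped)

# Hypothetical expression E = (x and y) or z.
def expr(t):
    return (t["x"] and t["y"]) or t["z"]

test = {"x": True, "y": True, "z": False}
sensitive_vars = [v for v in test if is_sensitive(expr, test, v)]
```

For this test, flipping x or y changes the result while flipping z does not, so the test covers x and y; a further test would be needed to cover z.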